Business methods for assessing nucleic acids

ABSTRACT

The present invention is directed to methods and compositions for evaluating nucleic acids, methods of preparing such compositions, and applications and business methods employing such compositions and methods. In particular, the present invention provides business methods for operating a gene expression measurement service.

CROSS-REFERENCE

This application claims the benefit of U.S. Provisional Application No. 60/646,157, filed Jan. 21, 2005, which is incorporated herein by reference in its entirety.

GOVERNMENT INTERESTS

Certain embodiments of the present invention were made under research grant numbers ES02679 and ES01247 from the National Institutes of Health; Grant No. RR00044 from the Division of Research Resources, Health Institute Contract 91-2, and International Lead Zinc Organization contract CH61, who may have certain rights thereto. Certain embodiments of the invention were made under Research Grant Nos. NIH CA85147, CA95806 and CA103594, who may have certain rights thereto. Certain embodiments of the present invention were made under research grant number E01640 from the National Institutes of Health, who may have certain rights thereto.

BACKGROUND OF THE INVENTION

With the sequencing of the human genome comes the hope of accelerating drug development and discovering better diagnostic tests. This hope has engendered a need to develop improved methods for multi-gene expression measurement. Methods amenable to appropriate quality control, for example, to meet regulatory guidelines, are particularly needed. The present invention relates to compositions and methods directed to addressing these hopes and needs.

Other methods and compositions directed thereto are provided in U.S. patent applications Ser. No. 10/109,349, filed Mar. 28, 2002, and U.S. Ser. No. 10/471,473; and U.S. Provisional Applications Ser. Nos. 60/368,288 and 60/368,409, filed Mar. 28, 2002; 60/550,278, filed Mar. 5, 2004 and 60/561,841, filed April 12, 2004.

SUMMARY OF THE INVENTION

A first aspect of the invention is a method comprising providing a first sample comprising a first nucleic acid; amplifying said first nucleic acid; and obtaining a relationship wherein said relationship can enumerate less than about 1,000 molecules of said first nucleic acid in said first sample. In some embodiments of the invention said relationship can enumerate less than about 100 molecules, less than about 10 molecules, or less than about 1 molecule of said first nucleic acid in said first sample. In other embodiments, said relationship compares a first relationship of amplified product of said first nucleic acid to co-amplified product of a competitive template for said first nucleic acid to a second relationship of amplified product of a second nucleic acid in said first sample to co-amplified product of a competitive template for said second nucleic acid. Typically, said first nucleic acid and said competitive template for said first nucleic acid are co-amplified in a first vessel and said second nucleic acid and said competitive template for said second nucleic acid are co-amplified in a second vessel. The competitive template for the first or second nucleic acid can comprise a sequence referenced in Table 4. In some embodiments, the second nucleic acid serves as a first reference nucleic acid, for example as a control for loading. The first reference nucleic acid can correspond to at least one gene selected from GADP, ACTB, and β-actin. The relationship can further compare amplified product of a number of other nucleic acid(s) to co-amplified product of competitive template(s) for said number of other nucleic acid(s). At least one of said other nucleic acids can serve as a second reference nucleic acid. The second reference nucleic acid can correspond to at least one gene selected from GADP, ACTB, and β-actin. In various embodiments, the relationship comprises a use of microfluidic capillary electrophoresis, an oligonucleotide array, mass spectrometry, or a chromatography.
In some embodiments, the relationship involves neither taking real-time measurements nor generating a standard curve. The relationship can control for sources of variation selected from cDNA loading, intra-nucleic acid amplification efficiency, inter-nucleic acid amplification efficiency, inter-specimen amplification efficiency, inter-sample amplification efficiency, and intra-sample amplification efficiency. The relationship is capable of detecting less than about a two-fold difference or less than about a one-fold difference. The relationship is capable of detecting less than about an 80% difference, less than about a 50% difference, less than about a 30% difference, or less than about a 20% difference. In some embodiments, the relationship is capable of detecting less than about a two-fold difference or less than about a one-fold difference in about 100 molecules or less or in about 10 molecules or less of said first nucleic acid in said first sample. The difference detected can be less than about an 80% difference, less than about a 50% difference, less than about a 30% difference, or less than about a 20% difference. In some embodiments, the relationship provides a coefficient of variation of less than about 25%, less than about 20%, less than about 15%, less than about 10%, or less than about 5% between said first sample and a second sample of said first nucleic acid. In some embodiments, the first and said second samples are amplified at different times, the first and said second samples are amplified in different laboratories, or the first and said second samples are provided from different subjects. The first nucleic acid can comprise a sequence referenced in Table 1 or 2. The methods described herein can reduce or eliminate the false negatives; preferably the false positives are reduced to a statistically insignificant number. The nucleic acid can be an RNA molecule or a DNA molecule.
Typically, the relationship is substantially constant beyond an exponential phase of said amplification of said first nucleic acid.
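
By way of non-limiting illustration, the comparison of a first relationship to a second relationship described above can be sketched in Python as a ratio of ratios. This is a minimal sketch only: the function names, the use of peak areas as the measured signal, and the numeric values are assumptions introduced here for illustration and are not taken from the specification.

```python
def template_ratio(native_signal: float, competitive_signal: float) -> float:
    """Ratio of amplified native product to co-amplified competitive template product."""
    if competitive_signal <= 0:
        raise ValueError("competitive template signal must be positive")
    return native_signal / competitive_signal


def normalized_relationship(target_ratio: float, reference_ratio: float) -> float:
    """Compare the target relationship to the reference relationship (ratio of ratios)."""
    if reference_ratio <= 0:
        raise ValueError("reference ratio must be positive")
    return target_ratio / reference_ratio


# Hypothetical peak areas in arbitrary units, chosen only for illustration.
first = template_ratio(native_signal=240.0, competitive_signal=480.0)   # first nucleic acid
second = template_ratio(native_signal=900.0, competitive_signal=600.0)  # reference nucleic acid
relative_level = normalized_relationship(first, second)
```

In this sketch each ratio is formed within a single vessel, so a change in loading that scales both native signals equally would cancel when the first relationship is divided by the second.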

Another embodiment is a method of assessing a first nucleic acid provided in a first sample, comprising co-amplifying said first nucleic acid, a number of other nucleic acid(s), a competitive template for said first nucleic acid and competitive template(s) for said other nucleic acid(s) wherein said competitive templates are at known concentrations relative to one another, to produce first amplified product thereof; diluting said first amplified product; and further co-amplifying said diluted first amplified product of said first nucleic acid and of said competitive template for said first nucleic acid, to produce second amplified product thereof. Typically, the number is at least about one other nucleic acid, at least about 100 other nucleic acids, or at least about 1,000 other nucleic acids. Typically, the diluting produces at least about a 100-fold dilution, at least about a 1,000-fold dilution, or at least about a 10,000-fold dilution. Preferably, the method enumerates less than about 1,000 molecules of said first nucleic acid in said first sample, less than about 100 molecules of said first nucleic acid in said first sample, less than about 10 molecules of said first nucleic acid in said first sample, or about 1 molecule of said first nucleic acid in said first sample. Preferably, at least one of said competitive templates comprises a sequence referenced in Table 4. The first nucleic acid can comprise a sequence referenced in Table 1 or 2. One of the other nucleic acids can serve as a first reference nucleic acid, such as a control for loading. The first reference nucleic acid can correspond to at least one gene selected from GADP, ACTB, and β-actin.
In one embodiment, the method of assessing comprises obtaining a first relationship, said first relationship comparing said second amplified product of said first nucleic acid to said second amplified product of said competitive template for said first nucleic acid; obtaining a second relationship, said second relationship comparing said first amplified product of said first reference nucleic acid to said first amplified product of said competitive template for said first reference nucleic acid; and comparing said first and said second relationships. In some embodiments, another one of said other nucleic acids serves as a second reference nucleic acid. The second reference nucleic acid can correspond to at least one gene selected from GADP, ACTB, and β-actin. In another embodiment, the method of assessing comprises obtaining a first relationship, said first relationship comparing said second amplified product of said first nucleic acid to said second amplified product of said competitive template for said first nucleic acid; obtaining a third relationship, said third relationship comparing said first amplified product of said second reference nucleic acid to said first amplified product of said competitive template for said second reference nucleic acid; and comparing said first and said third relationships. In some embodiments, the method further comprises diluting and further co-amplifying said diluted first amplified product of said first reference nucleic acid and of said competitive template for said first reference nucleic acid, to produce second amplified products thereof.
In yet another embodiment, the assessing comprises obtaining a first relationship, said first relationship comparing said second amplified product of said first nucleic acid to said second amplified product of said competitive template for said first nucleic acid; obtaining a fourth relationship, said fourth relationship comparing said second amplified product of said first reference nucleic acid to said second amplified product of said competitive template for said first reference nucleic acid; and comparing said first and said fourth relationships. In various embodiments, the relationship comprises a use of microfluidic capillary electrophoresis, an oligonucleotide array, mass spectrometry, or a chromatography. In some embodiments, the relationship involves neither taking real-time measurements nor generating a standard curve. The relationship can control for sources of variation selected from cDNA loading, intra-nucleic acid amplification efficiency, inter-nucleic acid amplification efficiency, inter-specimen amplification efficiency, inter-sample amplification efficiency, and intra-sample amplification efficiency. The assessing can detect less than about a two-fold difference, less than about a one-fold difference, less than about an 80% difference, less than about a 50% difference, less than about a 30% difference, or less than about a 20% difference. In some embodiments, the relationship provides a coefficient of variation of less than about 25%, less than about 20%, less than about 15%, less than about 10%, or less than about 5% between said first sample and a second sample of said first nucleic acid. In some embodiments, the first and said second samples are amplified at different times, the first and said second samples are amplified in different laboratories, or the first and said second samples are provided from different subjects. The first nucleic acid can comprise a sequence referenced in Table 1 or 2.
The methods described herein can reduce or eliminate the false negatives; preferably the false positives are reduced to a statistically insignificant number. The nucleic acid can be an RNA molecule or a DNA molecule. Typically, the relationship is substantially constant beyond an exponential phase of said amplification of said first nucleic acid. In some embodiments, the samples are diluted prior to amplification.
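
As a hedged illustration of why a first relationship can be carried through the diluting and further co-amplifying steps described above, the following Python sketch models both rounds with an idealized exponential amplification. The model assumes, purely for illustration, that the native template and its competitive template amplify with the same per-cycle efficiency; the function names, cycle count, efficiency, and copy numbers are hypothetical and are not values from the specification.

```python
def amplify(copies: float, cycles: int, efficiency: float) -> float:
    """Idealized amplification: each cycle multiplies copies by (1 + efficiency)."""
    return copies * (1.0 + efficiency) ** cycles


def two_round_ratio(native: float, competitive: float, cycles: int,
                    efficiency: float, dilution_factor: float) -> float:
    """Native/competitive ratio after round 1, dilution, and round 2."""
    # Round 1: both templates co-amplify with the same (assumed) efficiency.
    native_1 = amplify(native, cycles, efficiency)
    competitive_1 = amplify(competitive, cycles, efficiency)
    # Dilution scales both first amplified products by the same factor.
    native_d = native_1 / dilution_factor
    competitive_d = competitive_1 / dilution_factor
    # Round 2: further co-amplification of the diluted first amplified product.
    native_2 = amplify(native_d, cycles, efficiency)
    competitive_2 = amplify(competitive_d, cycles, efficiency)
    return native_2 / competitive_2


starting_ratio = 50.0 / 600.0
final_ratio = two_round_ratio(50.0, 600.0, cycles=35, efficiency=0.9,
                              dilution_factor=1000.0)
```

Under the stated equal-efficiency assumption the ratio after the second round equals the starting ratio up to floating-point rounding, because amplification and dilution each scale the two products by the same factor.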

Another aspect of the invention is a method of assessing a first nucleic acid in a first sample, comprising providing a standardized mixture comprising a competitive template for said first nucleic acid and a competitive template for a second nucleic acid in said first sample wherein said competitive templates are at known concentrations relative to each other; combining said first sample with said standardized mixture; co-amplifying said first nucleic acid and said competitive template for said first nucleic acid to produce first amplified product thereof; diluting said first amplified product; further co-amplifying said diluted first amplified product of said first nucleic acid and of said competitive template for said first nucleic acid, to produce second amplified product thereof; and co-amplifying said second nucleic acid and said competitive template for said second nucleic acid to produce first amplified product thereof. The first nucleic acid and said competitive template for said first nucleic acid can be co-amplified in a first vessel and said second nucleic acid and said competitive template for said second nucleic acid can be co-amplified in a second vessel. Typically, the diluting produces at least about a 100-fold dilution, at least about a 1,000-fold dilution, or at least about a 10,000-fold dilution. The method can enumerate less than about 1,000 molecules of said first nucleic acid in said first sample, less than about 100 molecules of said first nucleic acid in said first sample, less than about 10 molecules of said first nucleic acid in said first sample, or about 1 molecule of said first nucleic acid in said first sample. The standardized mixture can further comprise sufficient amounts of said competitive templates for assessing said first nucleic acid in more than about 10⁶ other samples, more than about 10⁸ other samples, more than about 10¹⁰ other samples, more than about 10¹¹ other samples, or more than about 10¹² other samples.
The standardized mixture can further comprise a number of other competitive template(s) for other nucleic acid(s) wherein said competitive template(s) are at known concentrations relative to one another, thereby allowing assessment of said other nucleic acids in said first sample. The number of other competitive templates can be at least about 100 or at least about 1,000. In some embodiments, the second nucleic acid serves as a first reference nucleic acid, such as a control for loading. The first reference nucleic acid can correspond to at least one gene selected from GADP, ACTB, and β-actin. In some embodiments, the method of assessing comprises obtaining a first relationship, said first relationship comparing said second amplified product of said first nucleic acid to said second amplified product of said competitive template for said first nucleic acid; obtaining a second relationship, said second relationship comparing said first amplified product of said first reference nucleic acid to said first amplified product of said competitive template for said first reference nucleic acid; and comparing said first and said second relationships. At least one of said other nucleic acids serves as a second reference nucleic acid and said second reference nucleic acid can correspond to at least one gene selected from GADP, ACTB, and β-actin. The method can further comprise co-amplifying said second reference nucleic acid and said competitive template for said second reference nucleic acid to produce first amplified product thereof.
The assessing can comprise obtaining a first relationship, said first relationship comparing said second amplified product of said first nucleic acid to said second amplified product of said competitive template for said first nucleic acid; obtaining a third relationship, said third relationship comparing said first amplified product of said second reference nucleic acid to said first amplified product of said competitive template for said second reference nucleic acid; and comparing said first and said third relationships. The standardized mixture can further comprise sufficient amounts of said competitive templates for assessing said first nucleic acid in more than about 10⁶ other samples, more than about 10⁸ other samples, more than about 10¹⁰ other samples, more than about 10¹¹ other samples, or more than about 10¹² other samples. The method can further comprise diluting and further co-amplifying said diluted first amplified product of said first reference nucleic acid and of said competitive template for said first reference nucleic acid, to produce second amplified products thereof. In one embodiment, the first nucleic acid and said competitive template for said first nucleic acid are further co-amplified in a first vessel and said first reference nucleic acid and said competitive template for said first reference nucleic acid are further co-amplified in a second vessel. In another embodiment, the assessing comprises obtaining a first relationship, said first relationship comparing said second amplified product of said first nucleic acid to said second amplified product of said competitive template for said first nucleic acid; obtaining a fourth relationship, said fourth relationship comparing said second amplified product of said first reference nucleic acid to said second amplified product of said competitive template for said first reference nucleic acid; and comparing said first and said fourth relationships.
In various embodiments, the relationship comprises a use of microfluidic capillary electrophoresis, an oligonucleotide array, mass spectrometry, or a chromatography. In some embodiments, the relationship involves neither taking real-time measurements nor generating a standard curve. The relationship can control for sources of variation selected from cDNA loading, intra-nucleic acid amplification efficiency, inter-nucleic acid amplification efficiency, inter-specimen amplification efficiency, inter-sample amplification efficiency, and intra-sample amplification efficiency. The assessing can detect less than about a two-fold difference, less than about a one-fold difference, less than about an 80% difference, less than about a 50% difference, less than about a 30% difference, or less than about a 20% difference. In some embodiments, the relationship provides a coefficient of variation of less than about 25%, less than about 20%, less than about 15%, less than about 10%, or less than about 5% between said first sample and a second sample of said first nucleic acid. In some embodiments, the first and said second samples are amplified at different times, the first and said second samples are amplified in different laboratories, or the first and said second samples are provided from different subjects. The first nucleic acid can comprise a sequence referenced in Table 1 or 2. The methods described herein can reduce or eliminate the false negatives; preferably the false positives are reduced to a statistically insignificant number. The nucleic acid can be an RNA molecule or a DNA molecule.
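
The coefficient of variation recited above can be computed, for illustration only, as the sample standard deviation divided by the mean. The following Python sketch assumes that convention (the specification does not state which estimator is used), and the replicate values are hypothetical.

```python
import statistics


def coefficient_of_variation(values: list[float]) -> float:
    """Sample standard deviation divided by the mean, expressed as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical replicate measurements of one nucleic acid, in arbitrary units.
replicates = [100.0, 110.0, 90.0]
cv_percent = coefficient_of_variation(replicates)
```

A value computed this way could then be compared against a chosen threshold, such as the "less than about 25%" level recited above.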

Another aspect of the invention is a method for assessing a first nucleic acid, comprising providing a series of serially-diluted standardized mixtures comprising a competitive template for said first nucleic acid and a competitive template for a second nucleic acid present in a number of samples comprising said first nucleic acid, wherein said competitive templates are at known concentrations relative to each other; combining one of said samples comprising said first nucleic acid with a first one of said serially-diluted standardized mixtures; co-amplifying said first nucleic acid and said competitive template for said first nucleic acid to produce amplified product thereof; obtaining a first relationship, said first relationship comparing said amplified product of said first nucleic acid to said amplified product of said competitive template for said first nucleic acid; determining whether said first relationship is within about 1:10 to about 10:1; if not, repeating said combining, co-amplifying, obtaining and determining steps with a second one of said serially-diluted standardized mixtures; co-amplifying said second nucleic acid and said competitive template for said second nucleic acid to produce amplified product thereof; obtaining a second relationship, said second relationship comparing said amplified product of said second nucleic acid to said amplified product of said competitive template for said second nucleic acid; and comparing said first and said second relationships. The method can further comprise diluting said amplified product of said first nucleic acid and said competitive template for said first nucleic acid; and further co-amplifying said diluted amplified product to produce further amplified product thereof. In addition, the method can further comprise diluting said amplified product of said second nucleic acid and said competitive template for said second nucleic acid; and further co-amplifying said diluted amplified product to produce further amplified product thereof.
The number of samples can comprise a series of serially-diluted samples of said second nucleic acid. In one embodiment of the method, one of said samples is selected to provide said second nucleic acid approximately calibrated to said competitive template for said second nucleic acid in said first one of said serially-diluted standardized mixtures. In another embodiment of the method, said first nucleic acid and said competitive template for said first nucleic acid are co-amplified in a first vessel and said second nucleic acid and said competitive template for said second nucleic acid are co-amplified in a second vessel. The second nucleic acid can serve as a first reference nucleic acid, such as a control for loading. The first reference nucleic acid can be GADP, ACTB, or β-actin. The first reference nucleic acid can be present at two different concentrations in two of said serially-diluted standardized mixtures. The series of serially-diluted standardized mixtures can further comprise sufficient amounts of said competitive templates for assessing said first nucleic acid in more than about 10⁶ samples, more than about 10⁸ samples, more than about 10¹⁰ samples, more than about 10¹¹ samples, or more than about 10¹² samples. The series of serially-diluted standardized mixtures can further comprise a number of other competitive template(s) for other nucleic acid(s) wherein said competitive template(s) are at known concentrations relative to one another, thereby allowing assessment of said other nucleic acid(s). This number can be at least about 100 other competitive templates or at least about 1,000 other competitive templates. At least one of said other nucleic acids can serve as a second reference nucleic acid. The second reference nucleic acid can correspond to at least one gene selected from GADP, ACTB, and β-actin.
The method of assessing can further comprise co-amplifying said second reference nucleic acid and said competitive template for said second reference nucleic acid to produce amplified product thereof; obtaining a third relationship, said third relationship comparing said amplified product of said second reference nucleic acid to said amplified product of said competitive template for said second reference nucleic acid; and comparing said first and said third relationships. The series of serially-diluted standardized mixtures can further comprise sufficient amounts of said number of other competitive template(s) for assessing said other nucleic acid(s) in more than about 10⁶ samples, in more than about 10⁸ samples, or in more than about 10¹⁰ samples. The series of serially-diluted standardized mixtures can further comprise sufficient amounts of said number of other competitive template(s) for assessing said other nucleic acid(s). The series of serially-diluted standardized mixtures can further comprise sufficient amounts of said number of other competitive template(s) for assessing said other nucleic acid(s) in more than about 10¹¹ samples or in more than about 10¹² samples. The method can be performed such that said first nucleic acid and said other nucleic acid(s) vary in amount over a range of more than about 2 orders of magnitude. The method can detect less than about a two-fold difference over said range, less than about a 50% difference over said range, or less than about a 20% difference over said range. In some embodiments of the method, said first nucleic acid and said other nucleic acid(s) vary in amount over a range of about 3 or more orders of magnitude and said assessing detects less than about a two-fold difference over said range, less than about a 50% difference over said range, or less than about a 20% difference over said range.
In other embodiments of the method, the first nucleic acid and said other nucleic acid(s) vary in amount over a range of about 4 or more orders of magnitude and the assessing detects less than about a two-fold difference over said range, less than about a 50% difference over said range, or less than about a 20% difference over said range. In some embodiments, the first nucleic acid and said other nucleic acid(s) vary in amount over a range of about 6 or more orders of magnitude and the assessing detects less than about a two-fold difference over said range, less than about a one-fold difference over said range, less than about an 80% difference over said range, less than about a 50% difference over said range, less than about a 30% difference over said range, or less than about a 20% difference over said range. In some methods, the first nucleic acid and said other nucleic acid(s) vary in amount over a range of about 7 or more orders of magnitude and said enumerating detects less than about a two-fold difference over said range, less than about a one-fold difference over said range, less than about an 80% difference, less than about a 50% difference, or less than about a 20% difference. The range can also be about 8, about 9, about 10, or about 15 orders of magnitude. In some methods the about 7 orders of magnitude span about a 7-log range of gene expression and said about 7 orders of magnitude can include about 10⁻³, about 10⁻², about 0.1, about 1, about 10, about 10², about 10³, and about 10⁴ copies/cell. The first nucleic acid can comprise a sequence referenced in Table 1 or 2. The competitive template for said first or said second nucleic acid can comprise a sequence referenced in Table 4. The competitive template for said first nucleic acid can be at a series of concentrations relative to said competitive template for said second nucleic acid.
The series of concentrations can provide 10-fold serial dilutions of said competitive template for said first nucleic acid relative to said competitive template for said second nucleic acid. At least two of said series of concentrations can span about one order of magnitude, at least two of said series of concentrations can span about three orders of magnitude, or at least two of said series of concentrations can span about 6 orders of magnitude. The series of concentrations can include at least two concentrations selected from about 10⁻¹¹ M, about 10⁻¹² M, about 10⁻¹³ M, about 10⁻¹⁴ M, about 10⁻¹⁵ M, and about 10⁻¹⁶ M. The series of concentrations can include at least three concentrations selected from about 10⁻¹¹ M, about 10⁻¹² M, about 10⁻¹³ M, about 10⁻¹⁴ M, about 10⁻¹⁵ M, and about 10⁻¹⁶ M. The series of concentrations can include at least six concentrations of about 10⁻¹¹ M, about 10⁻¹² M, about 10⁻¹³ M, about 10⁻¹⁴ M, about 10⁻¹⁵ M, and about 10⁻¹⁶ M. The first or said second relationship can be obtained with use of microfluidic capillary electrophoresis, an oligonucleotide array, mass spectrometry, or chromatography. In some embodiments of the method, the first or said second relationship involves neither taking real-time measurements nor generating a standard curve. The standardized mixtures can control for sources of variation such as cDNA loading, intra-nucleic acid amplification efficiency, inter-nucleic acid amplification efficiency, inter-specimen amplification efficiency, inter-sample amplification efficiency, and intra-sample amplification efficiency. The standardized mixtures of said series can enumerate less than about 1,000 molecules of said first nucleic acid in one of said samples, less than about 100 molecules of said first nucleic acid in one of said samples, less than about 10 molecules of said first nucleic acid in one of said samples, or about 1 molecule of said first nucleic acid in one of said samples.
The standardized mixtures of said series can provide a coefficient of variation of less than about 25%, less than about 15%, or less than about 10% between 2 of said samples comprising said first nucleic acid. The method can be performed on samples obtained from different subjects, different laboratories, or at different times. The method can reduce or eliminate false negatives to an insignificant number, such as to a statistically insignificant number. The method can be computer implemented, wherein the computer implementation comprises instructing a robotic handler to select said first one of said serially-diluted standardized mixtures for combining. The computer implementation can comprise obtaining said first relationship, such as by determining an area under a curve. The computer implementation can comprise instructing said robotic handler to select said second one of said serially-diluted standardized mixtures based on said first relationship. The nucleic acid assessed can be an RNA molecule or a DNA molecule.
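
The combining, co-amplifying, obtaining and determining steps described above, repeated until the first relationship falls within about 1:10 to about 10:1, can be sketched in Python as a simple selection loop. This is an illustrative sketch only: `select_mixture`, `in_range`, the mixture labels, and the simulated measurement are hypothetical stand-ins, no robotic handler or instrument interface is modelled, and the mixtures are simply tried in order rather than chosen based on the preceding relationship.

```python
from typing import Callable, Optional, Sequence


def in_range(ratio: float, low: float = 0.1, high: float = 10.0) -> bool:
    """True when the native:competitive ratio lies within about 1:10 to 10:1."""
    return low <= ratio <= high


def select_mixture(mixtures: Sequence[str],
                   measure_ratio: Callable[[str], float]) -> Optional[str]:
    """Try each serially-diluted mixture in turn until the ratio is in range."""
    for mixture in mixtures:
        ratio = measure_ratio(mixture)  # combine, co-amplify, obtain relationship
        if in_range(ratio):
            return mixture
    return None  # no mixture in the series brought the ratio into range


# Hypothetical series: competitive template concentration (M) in each mixture.
ct_concentration = {"A": 1e-11, "B": 1e-12, "C": 1e-13,
                    "D": 1e-14, "E": 1e-15, "F": 1e-16}
sample_equivalent = 3e-14  # hypothetical native template level, same units


def simulated_measurement(mixture: str) -> float:
    """Stand-in for the laboratory measurement; not a real instrument API."""
    return sample_equivalent / ct_concentration[mixture]


chosen = select_mixture(list(ct_concentration), simulated_measurement)
```

With these hypothetical values, mixtures A and B give ratios far below 1:10, and mixture C is the first to bring the ratio into range.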

Another aspect of the invention is a method for preparing a standardized mixture of reagents, said reagents comprising sufficient competitive template for assessing amounts of a number of nucleic acids in more than about 10⁶ samples wherein said standardized mixture allows direct comparison of said amounts between 2 of said samples. The number can be two nucleic acids, at least about 96 nucleic acids, at least about 100 nucleic acids, or at least about 1,000 nucleic acids. The method can involve the use of reagents that are sufficient to assess said amounts in more than about 10⁸ samples, more than about 10¹⁰ samples, more than about 10¹¹ samples, or more than about 10¹² samples. The method can employ reagents which further comprise a forward primer and/or a reverse primer for priming amplification of said competitive template for said number of nucleic acid(s). The competitive template, said forward primer and/or said reverse primer can comprise a sequence referenced in Table 4. The forward primer and/or said reverse primer can have substantially the same annealing temperature as another forward primer and/or reverse primer in said standardized mixture. The forward primer and/or said reverse primer can allow for detection of about 600 molecules or less of said nucleic acid(s), about 60 molecules or less of said nucleic acid(s), or about 6 molecules or less of said nucleic acid(s). At least one of said nucleic acids can comprise a sequence referenced in Table 1 or 2. One of said number of nucleic acids can serve as a first reference nucleic acid. The first reference nucleic acid can be a control for loading and can be GADP, ACTB, or β-actin. Also, another one of said number of nucleic acids can serve as a second reference nucleic acid. The second reference nucleic acid can be a gene selected from GADP, ACTB, and β-actin. The assessing can be performed with use of microfluidic capillary electrophoresis, an oligonucleotide array, mass spectrometry, or chromatography.
In some embodiments, the assessing involves neither taking real-time measurements nor generating a standard curve. The standardized mixtures can control for sources of variation such as cDNA loading, intra-nucleic acid amplification efficiency, inter-nucleic acid amplification efficiency, inter-specimen amplification efficiency, inter-sample amplification efficiency, and intra-sample amplification efficiency. The standardized mixtures of said series can enumerate less than about 1,000 molecules of said first nucleic acid in one of said samples, less than about 100 molecules of said first nucleic acid in one of said samples, less than about 10 molecules of said first nucleic acid in one of said samples, or about 1 molecule of said first nucleic acid in one of said samples. The standardized mixtures of said series can provide a coefficient of variation of less than about 25%, less than about 15%, or less than about 10% between 2 of said samples comprising said first nucleic acid. The method can be performed on samples obtained from different subjects, different laboratories, or at different times. The method can reduce or eliminate false negatives to an insignificant number, such as to a statistically insignificant number. The nucleic acid assessed can be an RNA molecule or a DNA molecule.
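
To give a sense of scale for a standardized mixture sufficient for more than about 10⁶ samples, the following Python sketch converts a per-assay number of competitive template molecules into moles. The per-assay loading of 600 molecules is a hypothetical figure chosen only for illustration; the specification does not state the amount of competitive template used per assay, and `moles_required` is an illustrative name.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def moles_required(molecules_per_assay: float, number_of_samples: float) -> float:
    """Moles of one competitive template needed to run the stated number of assays."""
    return molecules_per_assay * number_of_samples / AVOGADRO


# Hypothetical loading of 600 competitive template molecules per assay.
needed = moles_required(molecules_per_assay=600.0, number_of_samples=1e6)
```

Under this assumption, a million assays would consume on the order of a femtomole of each competitive template.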

Another aspect of the invention is a method comprising preparing a series of serially-diluted standardized mixtures of reagents, said reagents comprising sufficient competitive template for assessing amounts of a number of nucleic acids in more than about 10⁶ samples wherein said standardized mixtures allow direct comparison of said amounts between 2 of said samples. The number can be two nucleic acids, at least about 96 nucleic acids, at least about 100 nucleic acids, or at least about 1,000 nucleic acids. The method can involve the use of reagents that are sufficient to assess said amounts in more than about 10⁸ samples, more than about 10¹⁰ samples, more than about 10¹¹ samples, or more than about 10¹² samples. In some methods, the amounts can vary over a range of more than about 2 orders of magnitude, over a range of about 3 or more orders of magnitude, over a range of about 4 or more orders of magnitude, over a range of about 6 or more orders of magnitude, or over a range of about 7 or more orders of magnitude. The series can allow for detection of less than about a two-fold difference over said range, of less than about a 50% difference over said range, or of less than about a 20% difference over said range. Also, about 8 or more orders of magnitude, about 9 or more orders of magnitude, about 10 or more orders of magnitude, or about 15 or more orders of magnitude are encompassed herein. The about 7 orders of magnitude can span about a 7-log range of gene expression and can include about 10⁻³, about 10⁻², about 0.1, about 1, about 10, about 10², about 10³, and about 10⁴ copies/cell. The method can employ reagents which further comprise a forward primer and/or a reverse primer for priming amplification of said competitive template for said number of nucleic acid(s). The competitive template, said forward primer and/or said reverse primer can comprise a sequence referenced in Table 4. The forward primer and/or said reverse primer can have substantially the same annealing temperature as another forward primer and/or reverse primer in said standardized mixture. The forward primer and/or said reverse primer can allow for detection of about 600 molecules or less of said nucleic acid(s), about 60 molecules or less of said nucleic acid(s), or about 6 molecules or less of said nucleic acid(s). At least one of said nucleic acids can comprise a sequence referenced in Table 1 or 2. The competitive templates can comprise a first competitive template for a first one of said nucleic acids and a second competitive template for a second one of said nucleic acids wherein said first competitive template is at a series of concentrations relative to said second competitive template. The second nucleic acid can serve as a first reference nucleic acid, such as a control for loading, and can be GAPDH, ACTB, or β-actin. In the method the series of concentrations can provide 10-fold serial dilutions of said first competitive template relative to said second competitive template. At least two of said series of concentrations can span about one order of magnitude, about three orders of magnitude, or about 6 orders of magnitude. The series of concentrations can include concentrations selected from about 10⁻¹¹ M, about 10⁻¹² M, about 10⁻¹³ M, about 10⁻¹⁴ M, about 10⁻¹⁵ M, and about 10⁻¹⁶ M. The assessing can be performed with use of microfluidic capillary electrophoresis, an oligonucleotide array, mass spectrometry, or chromatography. In some embodiments, the method involves neither taking real-time measurements nor generating a standard curve. The standardized mixtures can control for sources of variation such as cDNA loading, intra-nucleic acid amplification efficiency, inter-nucleic acid amplification efficiency, inter-specimen amplification efficiency, inter-sample amplification efficiency, and intra-sample amplification efficiency. The standardized mixtures of said series can enumerate less than about 1,000 molecules of said first nucleic acid in one of said samples, less than about 100 molecules of said first nucleic acid in one of said samples, less than about 10 molecules of said first nucleic acid in one of said samples, or about 1 molecule of said first nucleic acid in one of said samples. The standardized mixtures of said series can provide a coefficient of variation of less than about 25%, less than about 15%, or less than about 10% between 2 of said samples comprising said first nucleic acid. The method can be performed on samples obtained from different subjects, different laboratories, or at different times. The method can reduce or eliminate false negatives to an insignificant number, such as to a statistically insignificant number. The nucleic acid assessed can be an RNA molecule or a DNA molecule.
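
By way of illustration only, the 10-fold serial dilution of a first competitive template relative to a fixed second (reference) competitive template can be sketched as follows. The concentrations of about 10⁻¹¹ M to about 10⁻¹⁶ M are taken from the paragraph above; the function name and the fixed reference concentration of 10⁻¹² M are hypothetical and are assumed only for this example.

```python
from fractions import Fraction


def serial_dilution(start, fold, steps):
    """Return `steps` concentrations, each `fold` times lower than the previous one."""
    return [Fraction(start) / Fraction(fold) ** i for i in range(steps)]


# First competitive template: about 1e-11 M down to about 1e-16 M (from the text).
target_ct_molar = serial_dilution(Fraction(1, 10**11), 10, 6)

# Hypothetical fixed concentration of the reference competitive template.
REFERENCE_CT_MOLAR = Fraction(1, 10**12)

# Ratio of first to second competitive template in each mixture of the series.
ct_ratios = [conc / REFERENCE_CT_MOLAR for conc in target_ct_molar]

for step, (conc, ratio) in enumerate(zip(target_ct_molar, ct_ratios), start=1):
    print(f"mixture {step}: target CT {float(conc):.0e} M, target/reference = {float(ratio):g}")
```

Exact fractions are used instead of floating point so that each step of the series is exactly one tenth of the step before it.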

Another aspect of the invention is compositions for use in the methods described herein. One embodiment is a composition comprising a standardized mixture of reagents, said reagents comprising sufficient competitive template for assessing amounts of a number of nucleic acids in more than about 10⁶ samples wherein said standardized mixture allows direct comparison of said amounts between 2 of said samples. Another embodiment is a composition comprising a series of serially-diluted standardized mixtures of reagents, said reagents comprising sufficient competitive template for assessing amounts of a number of nucleic acids in more than about 10⁶ samples wherein said standardized mixtures allow direct comparison of said amounts between 2 of said samples.

Another aspect of the invention is a database comprising numerical values corresponding to amounts of a first nucleic acid in a number of samples wherein said numerical values are directly comparable between about 5 of said samples. In the database the number can be at least about 10 samples, at least about 100 samples, at least about 1,000 samples, at least about 5,000 samples, or at least about 10,000 samples. The samples can be obtained from different subjects, from different species, from different laboratories, or at different times. In the database the amounts can show a coefficient of variation of less than about 25%, less than about 20%, less than about 15%, less than about 10%, or less than about 5% between 2 of said samples. The database can further comprise numerical values corresponding to amounts of a number of other nucleic acid(s) in said number of samples. The number can be at least about 100 other nucleic acids, at least about 1,000 other nucleic acids, or at least about 10,000 other nucleic acids. The amounts can be obtained using microfluidic capillary electrophoresis, an oligonucleotide array, mass spectrometry, or chromatography. In some embodiments, the amounts are obtained using neither real-time measurements nor generation of a standard curve. The numerical values can be corrected for sources of variation such as cDNA loading, intra-nucleic acid amplification efficiency, inter-nucleic acid amplification efficiency, inter-specimen amplification efficiency, inter-sample amplification efficiency, and intra-sample amplification efficiency. The numerical values typically correspond to numbers of molecules of said first nucleic acid in said number of samples. In the database, at least one of said numerical values can correspond to less than about 1,000 molecules, less than about 100 molecules, less than about 10 molecules, or to about 1 molecule of said first nucleic acid in at least one of said samples. The numerical values can correspond to less than about a two-fold difference or less than about a one-fold difference in said first nucleic acid between 2 of said samples. The numerical values can correspond to less than about an 80% difference, less than about a 50% difference, less than about a 30% difference, or less than about a 20% difference in said first nucleic acid between 2 of said samples. The numerical values can vary over a range of more than about 2 orders of magnitude, over a range of about 4 or more orders of magnitude, over a range of about 6 or more orders of magnitude, or over a range of about 7 or more orders of magnitude. The about 7 orders of magnitude can span about a 7-log range of gene expression and can include about 10⁻³, about 10⁻², about 0.1, about 1, about 10, about 10², about 10³, and about 10⁴ copies/cell. The numerical values typically do not comprise a statistically significant number of false positives. The numerical values can be used in at least one stage of drug development selected from drug target screening, lead identification, pre-clinical validation, clinical trial and patient treatment. The pre-clinical validation can be a bioassay and/or an animal study. In some embodiments, the direct comparison in the database does not use a bioinformatics resource. The nucleic acid can comprise an RNA molecule or a DNA molecule. The database can indicate a gene expression level corresponding to a biological state, such as a normal state or a disease state.
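
For illustration, a coefficient of variation such as those recited above could be computed from repeat measurements as sketched below. The measurement values and the function name are hypothetical, and the sketch assumes the conventional definition of the coefficient of variation (sample standard deviation divided by the mean), which the text itself does not spell out.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, expressed as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical repeat measurements of one nucleic acid (molecules per sample).
repeat_measurements = [100.0, 110.0, 90.0]

cv_percent = coefficient_of_variation(repeat_measurements)

# Compare against the least stringent threshold recited in the text (about 25%).
meets_threshold = cv_percent < 25.0
```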

In some embodiments, the database comprises numerical indices, said numerical indices obtained by mathematical computation of 2 numerical values, said 2 numerical values corresponding to amounts of 2 nucleic acids in a number of samples wherein said numerical indices are directly comparable between 5 of said samples. The numerical indices can indicate a biological state. In some embodiments, at least one numerical index is a balanced numerical index. The numerical index can be calculated by dividing a numerator by a denominator, said numerator corresponding to said amount of one of said 2 nucleic acids and said denominator corresponding to said amount of the other of said 2 nucleic acids. The numerator can correspond to a gene positively associated with said biological state and said denominator can correspond to a gene negatively associated with said biological state. In the database, said biological state can be a disease state, a predisposition to a disease state, a therapeutic drug response, a predisposition to a therapeutic drug response, an adverse drug response, a predisposition to an adverse drug response, a drug toxicity, or a predisposition to a drug toxicity.
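
As a minimal sketch of the numerator/denominator computation described above, a numerical index could be calculated as follows. The gene labels, the amounts, and the function name are hypothetical and serve only to illustrate the division; the amounts are assumed to be expressed in the same directly comparable units for both nucleic acids.

```python
def numerical_index(positive_amount, negative_amount):
    """Amount of a positively associated gene divided by that of a negatively associated gene."""
    if negative_amount <= 0:
        raise ValueError("denominator amount must be positive")
    return positive_amount / negative_amount


# Hypothetical amounts of two nucleic acids in two samples (same units throughout).
samples = {
    "sample_1": {"gene_up": 1200.0, "gene_down": 300.0},
    "sample_2": {"gene_up": 150.0, "gene_down": 600.0},
}

indices = {
    name: numerical_index(amounts["gene_up"], amounts["gene_down"])
    for name, amounts in samples.items()
}
```

Because both amounts in a sample are divided by one another, a higher index in this sketch corresponds to relatively more of the positively associated gene.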

Yet another aspect of the invention is a method for obtaining a numerical index that indicates a biological state, comprising providing 2 samples corresponding to each of a first biological state and a second biological state; assessing an amount of each of 2 nucleic acids in each of said 2 samples wherein said assessing can enumerate less than about 1,000 molecules of each of said 2 nucleic acids; providing said amounts as numerical values wherein said numerical values are directly comparable between a number of samples; mathematically computing said numerical values corresponding to each of said first and said second biological states; and determining a mathematical computation that discriminates said first and said second biological states, thereby obtaining said numerical index. The method of determining said mathematical computation can involve the use of software. The 2 nucleic acids can be associated with said first biological state and not with said second biological state. One of said 2 nucleic acids can be positively associated with said first biological state and the other of said 2 nucleic acids negatively associated with said first biological state. The mathematical computation can comprise dividing a numerator by a denominator, said numerator corresponding to said nucleic acid positively associated with said first biological state and said denominator corresponding to said nucleic acid negatively associated with said first biological state. The first biological state can be a disease state and said second biological state a non-disease state. The disease state can be an angiogenesis-related condition, an antioxidant-related condition, an apoptosis-related condition, a cardiovascular-related condition, a cell cycle-related condition, a cell structure-related condition, a cytokine-related condition, a defense response-related condition, a development-related condition, a diabetes-related condition, a differentiation-related condition, a DNA replication and/or repair-related condition, an endothelial cell-related condition, a folate receptor-related condition, a hormone receptor-related condition, an inflammation-related condition, an intermediary metabolism-related condition, a membrane transport-related condition, an oxidative metabolism-related condition, a neurotransmission-related condition, a cancer-related condition, a protein maturation-related condition, a signal transduction-related condition, a stress response-related condition, a tissue structure-related condition, a transcription factor-related condition, a transport-related condition, or a xenobiotic metabolism-related condition. In some embodiments, direct comparison does not use a bioinformatics resource.

Another embodiment is a method comprising state and not with a second biological state; providing 2 samples corresponding to each of said first biological state and said second biological state; assessing an amount of each of said 2 nucleic acids in each of said 2 samples wherein said assessing can enumerate less than about 1,000 molecules of each of said 2 nucleic acids; and mathematically computing said amounts corresponding to each of said first and said second biological states to determine a numerical index, said numerical index discriminating said first and said second biological states.

Yet another embodiment is a method of identifying a biological state comprising assessing an amount of each of 2 nucleic acids in a first sample, wherein said assessing can enumerate less than about 1,000 molecules of each of said 2 nucleic acids in said first sample; providing said amounts as numerical values wherein said numerical values are directly comparable between a number of samples; and using said numerical values to provide a numerical index, whereby said numerical index indicates said biological state.

Yet another embodiment is a method of identifying a biological state comprising assessing an amount of a nucleic acid in a first sample, wherein said assessing can enumerate less than about 1,000 molecules of said nucleic acid in said first sample; and providing said amount as a numerical value wherein said numerical value is directly comparable between a number of other samples.

Other aspects of the invention include business methods. One embodiment is a business method comprising collecting a first specimen comprising a first nucleic acid; measuring an amount of said first nucleic acid in a first sample of said first specimen wherein said measuring can enumerate less than about 1,000 molecules of said first nucleic acid in said first sample; and providing said amount as a numerical value wherein said numerical value allows direct comparison to an amount of said first nucleic acid in a second sample. The first and said second samples can be measured at different times or in different laboratories. The second sample can be obtained from said first specimen or a second specimen. The first and said second specimens can be collected from different subjects or from different species. The measuring step can be performed at least about 100 times per day, at least about 1,000 times per day, or at least about 4,000 times per day. The first specimen can comprise at least about 1,000 cells. The first specimen can comprise a human specimen, which can be collected without identifying information. The collecting information can comprise attesting to compliance with investigative protocol. The identifying information can be collected at a later time than said collection of said first specimen. The information can be collected via a website. The method can further comprise identifying which of said selected nucleic acids electrophorese together. The amounts of said identified nucleic acids can be electrophoresed simultaneously. The numerical value can be provided via e-mail. The assessing can comprise providing a standardized mixture comprising a competitive template for said first nucleic acid and a competitive template for a second nucleic acid in said first specimen wherein said competitive templates are at known concentrations relative to each other; combining said standardized mixture with a first sample of said specimen; co-amplifying said first nucleic acid and said competitive template for said first nucleic acid to produce first amplified product thereof; diluting said first amplified product; further co-amplifying said diluted first amplified product of said first nucleic acid and of said competitive template for said first nucleic acid, to produce second amplified product thereof; and co-amplifying said second nucleic acid and said competitive template for said second nucleic acid to produce first amplified product thereof. The first nucleic acid and said competitive template for said first nucleic acid can be co-amplified in a first vessel and said second nucleic acid and said competitive template for said second nucleic acid can be co-amplified in a second vessel. The method can further comprise obtaining a first relationship, said first relationship comparing said second amplified product of said first nucleic acid and said second amplified product of said competitive template for said first nucleic acid; obtaining a second relationship, said second relationship comparing said first amplified product of said second nucleic acid and said first amplified product of said competitive template for said second nucleic acid; and comparing said first and said second relationships. The second nucleic acid can serve as a reference nucleic acid. The standardized mixture can further comprise sufficient amounts of said competitive templates for assessing said first nucleic acid in more than about 10⁶ samples.
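
The comparison of the first and second relationships described above can be illustrated with a minimal sketch. The function names, the signal values, and the relative competitive template concentration are hypothetical; the sketch assumes that each relationship is expressed as a simple ratio of amplified product of the nucleic acid to amplified product of its competitive template, which is one possible way of expressing it, and that the comparison is scaled by the known relative concentration of the two competitive templates.

```python
def relationship(native_product, competitive_product):
    """Ratio of amplified product of a nucleic acid to that of its competitive template."""
    if competitive_product <= 0:
        raise ValueError("competitive template product must be positive")
    return native_product / competitive_product


def compare_relationships(first_rel, second_rel, ct_first_per_ct_second):
    """Ratio of the two relationships, scaled by the known CT concentration ratio."""
    return (first_rel / second_rel) * ct_first_per_ct_second


# Hypothetical product signals (arbitrary units, e.g. electropherogram peak areas).
first_rel = relationship(native_product=50.0, competitive_product=100.0)    # target
second_rel = relationship(native_product=200.0, competitive_product=100.0)  # reference

# Hypothetical known concentration of target CT relative to reference CT.
CT_TARGET_PER_CT_REFERENCE = 600 / 600_000

# Target nucleic acid expressed relative to the reference nucleic acid.
target_per_reference = compare_relationships(
    first_rel, second_rel, CT_TARGET_PER_CT_REFERENCE
)
```

In this sketch the result is a unitless amount of the first nucleic acid per unit of reference nucleic acid, so values from different samples measured with the same standardized mixture are on a common scale.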

Business methods for drug development are also provided herein. One embodiment is a business method of improving drug development, comprising collecting a first specimen comprising a nucleic acid from a first biological entity administered a candidate drug at a first stage of drug development; collecting a second specimen comprising said nucleic acid from a second biological entity at a second stage of drug development; assessing an amount of said nucleic acid in each of said first and said second specimen; directly comparing said amounts; and altering a step of said drug development based on said comparison. Another embodiment is a business method of improving drug development, comprising providing a database comprising numerical values corresponding to amounts of a first nucleic acid in a number of samples wherein said numerical values are directly comparable between 5 of said samples; collecting a first specimen comprising said first nucleic acid from a biological entity administered a candidate drug at a stage of drug development; assessing an amount of said first nucleic acid in a first sample of said first specimen; directly comparing said amount to at least one of said numerical values in said database; and altering a step of said drug development based on said comparison. The first or said second biological entity is typically at least one entity selected from a virus, a cell, a tissue, an in vitro culture, a plant, an animal, and a subject participating in a clinical trial. The first or said second stage of drug development can be drug target screening, lead identification, pre-clinical validation, clinical trial and/or patient treatment. The pre-clinical validation can be a bioassay and/or an animal study. The altering can comprise a stratification of a clinical trial. The stratification can involve identifying subjects to have a reduced side effect. The altering can reduce the time for said drug development.

Yet another embodiment is a business method of improving drug development, comprising providing a database comprising numerical indices, said numerical indices obtained by mathematical computation of 2 numerical values corresponding to amounts of 2 nucleic acids in a number of samples wherein said numerical indices are directly comparable between 5 of said samples; collecting a first specimen comprising said 2 nucleic acids from a biological entity administered a candidate drug at a stage of drug development; assessing an amount of each of said 2 nucleic acids in a first sample of said first specimen; using said 2 amounts to mathematically compute a first numerical index; directly comparing said first numerical index to at least one of said numerical indices in said database; and altering a step of said drug development based on said comparison.

BRIEF DESCRIPTION OF THE FIGURES

The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the objects, features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:

FIG. 1 illustrates a table showing numerical values for a number of nucleic acids corresponding to expression measurements for some candidate carboplatin chemoresistant genes in primary non-small cell lung cancer (NSCLC).

FIG. 2 illustrates a table showing numerical values for a number of nucleic acids corresponding to expression measurements for a number of genes in lung donor airway epithelial cells.

FIG. 3 illustrates an overall “two-step” process for evaluating nucleic acids in some embodiments.

FIG. 4 illustrates a table providing numerical values for a number of nucleic acids corresponding to expression measurements for a number of genes derived from Stratagene Human Reference RNA, measured using embodiments of both two-step and non-two step approaches. Corresponding sequences used as a forward primer (F), a reverse primer (R) and a competitive template (CT) for each of the genes are also provided (Sequence ID Nos. 1-282).

FIG. 5 illustrates a relationship between the amount of nucleic acid used in a PCR reaction and the number of copies of mRNA transcripts/cell that can be measured for a given number of cells/PCR reaction.

FIG. 6 illustrates a standardized mixture used in some embodiments of the present invention.

FIG. 7 illustrates re-calculating numerical values based on a first reference nucleic acid (β-actin) to numerical values based on a second reference nucleic acid (cyclophilin).

FIG. 8 illustrates use of a series of standardized mixtures, according to some embodiments of the instant invention.

FIG. 9 illustrates using a nucleic acid serving as a reference to balance a sample with a standardized mixture of a series of serially-diluted standardized mixtures.

FIG. 10 illustrates a cDNA dilution that provides a reference nucleic acid (β-actin) in balance with 600,000 molecules of the reference nucleic acid competitive template in a standardized mixture.

FIG. 11 illustrates a series of serially-diluted standardized mixtures A-F comprising a series of concentrations of competitive templates for target nucleic acids (6,000,000; 600,000; 60,000; 6,000; 600 and 60 molecules/μL, respectively) relative to a given concentration of competitive template for β-actin (600,000 molecules/μL).

FIG. 12 illustrates use of Mix E initially, based on the expression levels of most genes.

FIG. 13 illustrates a situation where the initial Mix used does not provide competitive template for the target nucleic acid (c-myc) sufficiently in balance with the amount of target nucleic acid in the cDNA dilution used.

FIG. 14 illustrates selection of a subsequent mix, Mix C, based on results obtained using the first Mix.

FIG. 15 illustrates the situation where the subsequent mix selected, Mix C, does provide competitive template for the target nucleic acid (c-myc) sufficiently in balance with the amount of target nucleic acid in the cDNA dilution used.

FIG. 16 illustrates calculation of a “ratio of ratios” based on data obtained using an appropriate Mix.

FIG. 17 illustrates a series of electropherograms for various genes.

FIG. 18 illustrates an overall system for assessing nucleic acids, one or more steps of which may be computer implemented in various embodiments.

FIG. 19 illustrates a non-linear relationship between amount of amplified product of glutathione peroxidase GSH-Px (empty boxes) or of glyceraldehyde-3-phosphate dehydrogenase GAPDH (solid boxes) and total starting amount of RNA for increasing amounts of RNA, e.g., beyond the exponential phase of amplification. Straight lines represent theoretical amounts of PCR product (either GSH-Px or GAPDH) that would be obtained if amplification remained exponential throughout the amplification process.

FIG. 20 illustrates a linear relationship between the ratio of (amplified product of nucleic acid/co-amplified product of its competitive template) and total starting amount of RNA for first and second nucleic acids corresponding to GSH-Px (empty boxes) and GAPDH (solid boxes), respectively.

FIG. 21 illustrates that the relationship of (amplified product of first nucleic acid/co-amplified product of its competitive template)/(amplified product of second nucleic acid/co-amplified product of its competitive template) to total starting amount of RNA remains constant, or substantially constant, for the two different nucleic acids when amplified in accordance with various embodiments of the instant invention.

FIG. 22 tabulates a number of sources of variation and control methods.

FIG. 23 illustrates the control of one or more sources of error in some embodiments compared to real-time RT-PCR in two different specimens in four different experiments.

FIG. 24 illustrates development and use of a database of numerical values of some embodiments described herein.

FIG. 25 illustrates use of numerical indices in identifying a biological state.

FIG. 26 illustrates the overall process relating to using micro-array screens with embodiments of the instant invention.

FIG. 27 illustrates the overall process of some embodiments of a business method for evaluating nucleic acids.

FIG. 28 illustrates the overall process of some embodiments of a business method for improving drug development.

FIG. 29 illustrates experiments comparing a non-two step with a two-step approach, according to some embodiments of the instant invention.

FIG. 30 illustrates the results of experiments comparing a non-two step with a two-step approach, according to some embodiments of the instant invention.

FIG. 31 is a graph showing the correlation of gene expression values obtained by either 96 gene two-step or non-two-step approaches.

FIG. 32 illustrates a method for designing competitive template for use in some embodiments of the instant invention.

FIG. 33 illustrates a calculation of gene expression based on densitometric values for electrophoretically separated amplified product of GST NT and CT.

FIG. 34 illustrates negative photographs of the gels analyzed by densitometry.

FIG. 35 illustrates a calculation based on densitometry values.

FIG. 36 illustrates that a similar increase in expression of the CYP1A1 gene was observed in both Northern analysis and some embodiments of methods disclosed herein.

FIG. 37 illustrates results from experiments comparing some embodiments of the instant invention with oligonucleotide microarray analysis.

FIG. 38 illustrates a greater linear dynamic range obtained using some two step embodiments vs. microarray analysis.

Each of these figures provides an illustration only, and is in no way intended to be limiting with respect to the present invention. For example, those skilled in the art will readily appreciate variations and modifications of the schemes illustrated based on the teachings provided herein.

DETAILED DESCRIPTION OF THE INVENTION

The present invention is directed to methods and compositions for evaluating nucleic acids, methods of preparing such compositions, and applications and business methods employing such compositions and methods. Some aspects of the present invention relate to improvements upon the Willey and Willey et al. U.S. Pat. Nos. 5,043,390; 5,639,606; and 5,876,978.

I. Methods for Assessing a Nucleic Acid

One aspect of the present invention relates to methods for assessing amounts of a nucleic acid in a sample. In some embodiments, the invention allows measurement of small amounts of a nucleic acid, for example, where the nucleic acid is expressed in low amounts in a specimen, where small amounts of the nucleic acid remain intact and/or where small amounts of a specimen are provided. For example, in some embodiments, practice of the invention assesses gene expression in small samples of biological specimens.

“Specimen” as used herein can refer to material collected for analysis, e.g., a swab of culture, a pinch of tissue, a biopsy extraction, a vial of a bodily fluid, e.g., saliva, blood and/or urine, etc., that is taken for research, diagnostic or other purposes from any biological entity. “Biological entity” as used herein can refer to any entity capable of harboring a nucleic acid, including any species, e.g., a virus, a cell, a tissue, an in vitro culture, a plant, an animal, and/or a subject participating in a clinical trial. “Sample” as used herein can refer to specimen material used for a given assay, reaction, run, trial, and/or experiment. For example, a sample may comprise an aliquot of the specimen material collected, up to and including all of the specimen. As used herein the terms assay, reaction, run, trial and/or experiment can be used interchangeably. Some embodiments of the present invention can be practiced using a small starting amount of nucleic acid to yield quantifiable amounts.

In some embodiments, the specimen collected may comprise less than about 100,000 cells, less than about 10,000 cells, less than about 5,000 cells, less than about 1,000 cells, less than about 500 cells, less than about 100 cells, less than about 50 cells, or less than about 10 cells. In some embodiments, methods of the present invention are capable of assessing the amount of a nucleic acid present in a sample comprising less than about 100,000 cells. For example, a sample from a biopsy may comprise less than about 100,000 cells. In some embodiments, the method is capable of assessing the amount of a nucleic acid in less than about 10,000 cells, less than about 5,000 cells, less than about 1,000 cells, less than about 500 cells, less than about 100 cells, less than about 50 cells, or less than about 10 cells. Small biological specimens can also refer to amounts typically collected in biopsies, e.g., endoscopic biopsies (using brush and/or forceps), needle aspirate biopsies (including fine needle aspirate biopsies), as well as amounts provided in sorted cell populations (e.g., flow-sorted cell populations) and/or micro-dissected materials (e.g., laser captured micro-dissected tissues). For example, biopsies of suspected cancerous lesions in the lung, breast, prostate, thyroid, and pancreas commonly are done by fine needle aspirate (FNA) biopsy; bone marrow is also obtained by biopsy; and tissues of the brain, developing embryo, and animal models may be obtained as laser captured micro-dissected samples.

In some embodiments, assessing, evaluating and/or measuring a nucleic acid can refer to providing a measure of the amount of a nucleic acid in a specimen and/or sample, e.g., to determine the level of expression of a gene. In some embodiments, providing a measure of an amount refers to detecting a presence or absence of the nucleic acid of interest. In some embodiments, providing a measure of an amount can refer to quantifying an amount of a nucleic acid, e.g., providing a measure of concentration or degree of the amount of the nucleic acid present. In some embodiments, providing a measure of the amount of nucleic acid can refer to enumerating the amount of the nucleic acid, e.g., indicating a number of molecules of the nucleic acid present in a sample. The nucleic acid of interest may be referred to as a target nucleic acid, and a gene of interest, e.g., a gene being evaluated, may be referred to as a target gene.

In some embodiments, methods of the present invention are capable of enumerating less than about 1,000 molecules of a nucleic acid in a sample, e.g., about 800, about 600, or about 400 molecules of the nucleic acid. In some embodiments, less than about 100 molecules, e.g., about 60 molecules, preferably less than about 10 molecules, e.g., about 6 molecules, or more preferably less than about 1 molecule of a nucleic acid can be enumerated in a sample. For example, in preferred embodiments, a single molecule of nucleic acid template can give rise to detectable amplified product. In some embodiments, methods of the instant invention can measure less than about 10,000,000, less than about 5,000,000, less than about 1,000,000, less than about 500,000, less than about 100,000, less than about 50,000, less than about 10,000, less than about 8,000, less than about 6,000, less than about 5,000, or less than about 4,000 molecules of a nucleic acid in a sample. The number of molecules of a nucleic acid can also be referred to as the number of copies of the nucleic acid found in a sample and/or specimen.

The practice of some embodiments of the present invention permits rare transcripts to be measured with statistical significance. For example, in some embodiments, the number of copies of a nucleic acid corresponding to a gene transcript can be determined, e.g., the number of copies/cell, where the gene is expressed in low copy number. Enumerating less than about 1,000 molecules can allow measurement of less than about 10 copies/cell of at least 100 different gene transcripts in a small biological specimen, e.g., from the amount of material typically used to obtain one gene measurement, e.g., to measure that few copies of a nucleic acid corresponding to one gene. In some embodiments, methods of the instant invention are capable of measuring and/or enumerating less than about 10 copies/cell of at least 100 different gene transcripts in a small biological specimen, from the amount of material typically used to obtain one gene measurement. In some embodiments, enumerating less than about 10,000 molecules can allow measurement of less than about 10 copies/cell of at least 100 different gene transcripts in a small biological specimen, e.g., from the amount of material typically used to obtain one gene measurement, e.g., to measure that few copies of a nucleic acid corresponding to one gene.

In still other embodiments, more measurements can be obtained from a given specimen and/or sample, e.g., of the size typically used to measure that few copies of a nucleic acid corresponding to one gene. For example, practice of some embodiments of the invention disclosed herein can measure and/or enumerate less than about 100, less than about 50, less than about 20, less than about 10, less than about 8, or less than about 5 copies/cell of at least about 20, at least about 50, at least about 80, at least about 100, at least about 120, at least about 150, or at least about 200 different nucleic acids in a sample, e.g., corresponding to different gene transcripts.

The expressed material may be endogenous to the biological entity, e.g., transcripts of a gene naturally expressed in a given cell type, or the expressed material to be measured may be of an exogenous nature. For example, in some embodiments, methods of the present invention can be used to quantify transfected genes following gene therapy and/or a reporter gene in transient transfection assays, e.g., to determine the efficiency of transfection (Morales, M. J., and Gottlieb, D. I., A polymerase chain reaction-based method for detection and quantification of reporter gene expression in transient transfection assays, Analytical Biochemistry, 210, 188-194 (1993)).

As used herein, “nucleic acid” can refer to a polymeric form of nucleotides and/or nucleotide-like molecules of any length. In preferred embodiments, the nucleic acid can serve as a template for synthesis of a complementary nucleic acid, e.g., by base-complementary incorporation of nucleotide units. For example, a nucleic acid can comprise naturally occurring DNA, e.g., genomic DNA; RNA, e.g., mRNA; and/or can comprise a synthetic molecule, including but not limited to cDNA and recombinant molecules generated in any manner. For example, the nucleic acid can be generated from chemical synthesis, reverse transcription, DNA replication or a combination of these generating methods. The linkage between the subunits can be provided by phosphates, phosphonates, phosphoramidates, phosphorothioates, or the like, or by nonphosphate groups as are known in the art, such as peptide-type linkages utilized in peptide nucleic acids (PNAs). The linking groups can be chiral or achiral. The polynucleotides can have any three-dimensional structure, encompassing single-stranded, double-stranded, and triple helical molecules that can be, e.g., DNA, RNA, or hybrid DNA/RNA molecules. A nucleotide-like molecule can refer to a structural moiety that can act substantially like a nucleotide, for example exhibiting base complementarity with one or more of the bases that occur in DNA or RNA and/or being capable of base-complementary incorporation. The terms “polynucleotide,” “polynucleotide molecule,” “nucleic acid molecule,” “polynucleotide sequence” and “nucleic acid sequence” can be used interchangeably with “nucleic acid” herein. In some specific embodiments, the nucleic acid to be measured may comprise a sequence corresponding to a gene referenced in Table 1 or 2, in FIG. 1 or 2, respectively.

In some embodiments, the specimen collected comprises RNA to be measured, e.g., mRNA expressed in a tissue culture. In some embodiments, the specimen collected comprises DNA to be measured, e.g., cDNA reverse transcribed from transcripts. In some embodiments, the nucleic acid to be measured is provided in a heterogeneous mixture of other nucleic acid molecules.

A. Two-Step Approach

In some embodiments, the present invention provides a method of assessing a nucleic acid provided in a sample, comprising co-amplifying the nucleic acid, a number of other nucleic acid(s), a competitive template for the nucleic acid and a competitive template(s) for the other nucleic acid(s), e.g., to produce first amplified product thereof. In some embodiments, first amplified product can be diluted and then further co-amplified, e.g., to produce second amplified product thereof. Amplifying and then further amplifying nucleic acid and competitive template for the nucleic acid may be considered as two rounds of amplification, and a process employing two rounds of amplification may be referred to as a “two-step” process or “two-step” approach.

FIG. 3 schematically illustrates some embodiments of the overall “two-step” process described herein, e.g., where the amplified nucleic acid is cDNA. Experimental details comparing a two-step approach to a non-two-step approach can be found in Example I below.

At step 301 of FIG. 3, for example, RNA can be extracted from specimen cells or tissues.

At step 302 of FIG. 3, extracted RNA can be reverse transcribed to provide cDNA. In some embodiments, the amplified nucleic acid is a nucleic acid other than cDNA, as described above. In some embodiments, although reverse transcription efficiency may be variable, the representation of one nucleic acid in comparison to another in the resultant cDNA product may not be affected. That is, in some embodiments, the amount of cDNA of target nucleic acid compared with the amount of cDNA of a second nucleic acid (e.g., a second nucleic acid serving as a reference nucleic acid) can remain equivalent or substantially equivalent to the amount of mRNA of target nucleic acid compared with the amount of mRNA of the second nucleic acid.

At step 303 of FIG. 3, native cDNA and its competitive template are co-amplified in a first round of amplification. Native cDNA may comprise both the target nucleic acid and one or more other nucleic acids, which can be co-amplified with a competitive template for the target nucleic acid and a competitive template for one or more of the other nucleic acids. For example, the cDNA may be serially diluted and one or more serial dilutions then amplified.

In preferred embodiments, the competitive templates of at least two nucleic acids are at known concentrations relative to one another. “Competitive template” as used herein can refer to a nucleic acid that competes with a target nucleic acid during an amplification reaction. That is, when present in a reaction mixture for amplifying the target nucleic acid, the competitive template competes to serve as the template for such amplification. In some embodiments, for example, the competitive template for a given nucleic acid has a structure allowing its amplification to the same or substantially the same extent as the given nucleic acid. In preferred embodiments, a competitive template for a given nucleic acid can be amplified using one or more of the same primers as that of the given nucleic acid and/or amplifies with the same or substantially the same efficiency as the given nucleic acid. In preferred embodiments, a competitive template for a given nucleic acid is amplified using the same primers, shares sequence homology, and/or amplifies with the same or substantially similar efficiency as the given nucleic acid. In some embodiments, competitive templates are referred to as internal standards or as competitive template internal standards.

The term “native template” as used herein can refer to nucleic acid obtained directly or indirectly from a specimen that can serve as a template for amplification. For example, it may refer to cDNA molecules, corresponding to a gene whose expression is to be measured, where the cDNA is amplified and quantified. In some specific embodiments, at least one competitive template used comprises a sequence referenced in Table 4 of FIG. 4.

The term “primer” generally refers to a nucleic acid capable of acting as a point of initiation of synthesis along a complementary strand when conditions are suitable for synthesis of a primer extension product. In some specific embodiments, at least one primer used comprises a sequence referenced in Table 4 of FIG. 4. Also, the table in FIG. 4 shows the primer sequence and position for several genes whose expression can be measured.

Preferably, the competitive template has a feature distinguishing it from the target nucleic acid, allowing its amplified product to be distinguished from the amplified product of the target nucleic acid. For example, the competitive template can comprise mutants of the nucleic acid to be evaluated. Mutations can be point mutations, insertions, inversions, deletions or the like. For example, in some embodiments, a competitive template comprises at least one nucleotide that is different from the corresponding nucleotide in the nucleic acid to be evaluated. In some embodiments, the competitive template comprises at least about two, at least about three, at least about 5, at least about 10, at least about 15, or at least about 20 nucleotides that are different. Longer deletions, insertions, inversions, substitutions and/or other alterations are provided in some embodiments. For example, artificially shortened competitive templates may be generated according to the method described by Celi et al., Nucleic Acids Res. 21:1047 (1993).

In some preferred embodiments, the competitive template comprises an alteration that causes a loss and/or a gain of one or more cleavage sites in the competitive template compared to its corresponding nucleic acid. For example, a base may be substituted in a competitive template sequence to result in the gain and/or loss of a restriction endonuclease recognition site, chemical cleavage site, or other specific cleavage site. Various programs may be used to identify and match one or two or more base mismatch sequences for known recognition sites. For example, the Map program within the Genetics Computer Group software package (Devereux et al., supra, 1984) may be used. In this program, cDNA sequences are obtained for a given nucleic acid, and then the sequence is evaluated for the presence of one or two base pair mismatches for known restriction endonucleases.
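
By way of illustration only, the mismatch search described above can be sketched in a few lines of Python. This sketch is not the Map program and makes no claim to reproduce its behavior; the function name near_sites, its parameters, and the example sequence are hypothetical and are introduced here solely for illustration. The sketch scans a sequence for windows that differ from a recognition sequence (here GAATTC, the EcoRI recognition sequence) at one or two positions, i.e., positions where a base substitution in a competitive template could create the site.

```python
def near_sites(sequence, site, max_mismatches=2):
    """Return (position, window, mismatches) for every window of `sequence`
    that differs from `site` at between 1 and `max_mismatches` positions.
    Exact matches (0 mismatches) are skipped, since the site already exists there."""
    sequence = sequence.upper()
    site = site.upper()
    hits = []
    for start in range(len(sequence) - len(site) + 1):
        window = sequence[start:start + len(site)]
        mismatches = sum(1 for a, b in zip(window, site) if a != b)
        if 1 <= mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Hypothetical cDNA fragment (not taken from any table or figure herein).
example_cdna = "AAGAATTAAA"
for start, window, mismatches in near_sites(example_cdna, "GAATTC", max_mismatches=1):
    print(start, window, mismatches)
```

Each reported window marks a candidate position at which a substitution of the mismatched base(s) would introduce the recognition site into the competitive template while leaving the native sequence uncut.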

In some embodiments, the competitive template comprises an alteration that causes a loss and/or a gain of one or more specific recognition sites in the competitive template compared to its corresponding nucleic acid. For example, a base may be substituted in a competitive template sequence to result in the gain and/or loss of a protein binding site such as a transcription factor binding site. Other structural changes for distinguishing amplified product of a competitive template from amplified product of its corresponding nucleic acid will be apparent to those of skill in the art and are also within the scope of the instant invention.

Amplification can be achieved by any methods known in the art and/or disclosed herein for amplifying nucleic acid molecules. When polymerase chain reaction (PCR) amplification is used, conditions can include the presence of ribonucleotide and/or deoxyribonucleotide di-, tri-, tetra-, penta- and/or higher order phosphates; primers for PCR amplification for at least one nucleic acid and its corresponding competitive template; and at least one polymerization-inducing agent, such as reverse transcriptase, RNA polymerase and/or DNA polymerase. Examples of DNA polymerases include, but are not limited to, E. coli DNA polymerase, Sequenase 2.0®, T4 DNA polymerase or the Klenow fragment of DNA polymerase I, T3, SP6 RNA polymerase, AMV, M-MLV, and/or Vent polymerase, as well as ThermoSequenase™ (Amersham) or Taquenase™ (ScienTech, St Louis, Mo.). Further examples include thermostable polymerases isolated from Thermus aquaticus, Thermus thermophilus, Pyrococcus woesei, Pyrococcus furiosus, Thermococcus litoralis, and Thermotoga maritima. The polymerization-inducing agent and nucleotides may be present in a suitable buffer, which may include constituents which are co-factors or which affect conditions such as pH and the like at various suitable temperatures. PCR primers used are preferably single stranded, but double-, triple- and/or higher order stranded nucleotide molecules can be practiced with the present invention. As used herein, “amplified product” can refer to any nucleic acid synthesized at least partly by base-complementary incorporation using another nucleic acid as template. An amplified product may also be referred to as an amplicon and/or amplimer herein. Amplification may be carried out for a number of cycles of PCR, e.g., at least about 10, at least about 20, at least about 30, at least about 35, at least about 40, or at least about 50 cycles in some embodiments.

In some embodiments, more than one nucleic acid (and its corresponding competitive template) are co-amplified. In preferred embodiments, the number of other nucleic acids is at least one. In some embodiments, the number is at least about 50 other nucleic acids, at least about 100 other nucleic acids, at least about 200 other nucleic acids, at least about 300 other nucleic acids, at least about 500 other nucleic acids, at least about 800 other nucleic acids, at least about 1,000 other nucleic acids, at least about 5,000 other nucleic acids, at least about 10,000 other nucleic acids, at least about 50,000 other nucleic acids, or at least about 100,000 other nucleic acids. A competitive template can be used for each additional nucleic acid to be evaluated and, in preferred embodiments, a plurality of nucleic acids in a sample can be measured simultaneously.

At least one of the other nucleic acids can serve as a reference nucleic acid. “Reference nucleic acid” as used herein can refer to a nucleic acid that is amplified as well as the nucleic acid to be evaluated. The nucleic acid can be “normalized” to a reference nucleic acid. In some embodiments, the reference nucleic acid serves as a control for loading, e.g., to control for cDNA loaded into the reaction. For example, in some preferred embodiments, the reference nucleic acid comprises a nucleic acid that is not expected to vary (or to vary significantly) among given biological specimens and/or in response to certain stimuli. For example, mRNA from a constitutively expressed gene may provide the reference nucleic acid. In some embodiments, known or potential housekeeping genes may provide the reference nucleic acid, including but not limited to human, mouse and/or rat glyceraldehyde-3-phosphate dehydrogenase (GAPD or GAPDH), β-actin, 28S RNA, 18S RNA, and/or other ribonuclear protein genes. Other housekeeping genes that have been used as internal standards in Northern analyses of gene expression may also be used. See, e.g., Devereux et al., Nucleic Acids Res. 12:387 (1984); Barbu et al., Nucleic Acids Res. 17:7115 (1989). In some embodiments, a competitive template for a reference nucleic acid may comprise a nucleic acid having a sequence similar to either strand of cDNA of a housekeeping gene, but having a distinguishable feature as described above.

Many different genes can provide reference nucleic acids. The choice of reference nucleic acid may depend on the tissues to be assayed and/or the biological states being studied. For example, β-actin varies little among different normal bronchial epithelial cell samples (see, e.g., Crawford, E. L., Khuder, S. A., Durham, S. J., et al. (2000) Normal bronchial epithelial cell expression of glutathione transferase P1, glutathione transferase M3, and glutathione peroxidase is low in subjects with bronchogenic carcinoma. Cancer Res. 60, 1609-1618), but it may vary over about 100-fold in samples from different tissues, such as bronchial epithelial cells compared to lymphocytes.

At step 304 of FIG. 3, amplified product of native cDNA and competitive template (obtained in round one) are diluted before further amplification in round two. In some embodiments, amplified product of target nucleic acid and its corresponding competitive template may be diluted. In some embodiments, amplified product of a reference nucleic acid and its corresponding competitive template may be diluted. Diluting amplified product may be achieved by any technique known in the art and/or described herein. For example, diluting may involve removal of an aliquot of a mixture comprising first amplified product, and transfer to a vessel containing additional buffer. In some embodiments, diluting produces at least about a 1,000,000-fold dilution, at least about a 500,000-fold dilution, at least about a 100,000-fold dilution, at least about a 50,000-fold dilution, at least about a 10,000-fold dilution, at least about a 5,000-fold dilution, at least about a 1,000-fold dilution, at least about a 500-fold dilution, or at least about a 100-fold dilution.
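
As a purely arithmetic illustration of the dilution step, the fold dilution produced by transferring an aliquot into a vessel containing additional buffer, and the cumulative fold dilution produced by repeating such transfers, can be sketched as follows. The function names and the example volumes are hypothetical and are not taken from the Examples; any consistent volume unit may be used.

```python
def fold_dilution(aliquot_volume, buffer_volume):
    """Fold dilution from transferring `aliquot_volume` of a mixture into
    `buffer_volume` of buffer (both in the same unit, e.g., microliters)."""
    if aliquot_volume <= 0 or buffer_volume < 0:
        raise ValueError("aliquot volume must be positive and buffer volume non-negative")
    return (aliquot_volume + buffer_volume) / aliquot_volume


def cumulative_fold(steps):
    """Cumulative fold dilution over serial transfers, where each step is an
    (aliquot_volume, buffer_volume) pair."""
    total = 1.0
    for aliquot_volume, buffer_volume in steps:
        total *= fold_dilution(aliquot_volume, buffer_volume)
    return total


# Hypothetical example: two serial transfers of 1 volume into 99 volumes of buffer.
print(cumulative_fold([(1, 99), (1, 99)]))
```

Under these assumed volumes, each transfer gives a 100-fold dilution and the two transfers together give a 10,000-fold dilution, which falls within the ranges recited above.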

At step 305 of FIG. 3, diluted amplified product of native cDNA and competitive template (obtained in round one) are further amplified in round two. In some embodiments, diluted amplified product of a target nucleic acid and its corresponding competitive template may be further co-amplified in a second round of amplification. In some embodiments, diluted amplified product of a reference nucleic acid and its corresponding competitive template may be further co-amplified in a second round of amplification. As mentioned above, the use of two rounds may be referred to as a “two-step” approach. In some embodiments, target nucleic acid and/or the reference nucleic acid can be subjected to more than two rounds of amplification. For example, second amplified product of the target nucleic acid and its corresponding competitive template may be again diluted and further amplified and/or second amplified product of the reference nucleic acid and its corresponding competitive template may be again diluted and further amplified.

Various nucleic acids and corresponding competitive templates may be amplified in a given vessel during round one and/or round two of a two-step process. For example, in some embodiments, more than one nucleic acid (each with its corresponding competitive template) are co-amplified in a given vessel. In some embodiments, repeat amplifications are carried out with fewer different nucleic acids (each with its corresponding competitive template) in a given vessel. For example, in some preferred embodiments, amplified products are further amplified with primers for a nucleic acid corresponding to one gene. For example, co-amplifying diluted first amplified product of a nucleic acid and of the competitive template for the nucleic acid can be achieved by using a primer pair for co-amplifying the particular nucleic acid and its corresponding competitive template dried onto the vessel used in round two. For example, primers for individual genes can be aliquotted into individual reaction vessels and dried down, e.g., on 384-well plates. Multiple plates loaded with primers (e.g., about 10, about 100, about 500 plates) can be prepared in advance. For example, in some embodiments, primers prepared this way are stable at 4° C. for months.

At step 306 of FIG. 3, amounts of amplified products can be compared. In some embodiments, the amount of amplified product of a target nucleic acid is compared to the amount of amplified product of its competitive template. In some embodiments, comparison involves obtaining a relationship, e.g., a first relationship reflecting the amplified amounts of target nucleic acid compared with the amplified amounts of its competitive template. In preferred embodiments, this relationship is provided as a ratio, e.g., a first ratio of the amount of amplified product of a nucleic acid to the amount of amplified product of its competitive template, e.g., where the nucleic acid and its competitive template are co-amplified.

In some embodiments, the amount of amplified product of a target nucleic acid is compared to a reference nucleic acid. In preferred embodiments, the reference nucleic acid is itself compared to a competitive template for the reference nucleic acid. For example, in some embodiments, the amount of amplified product of a reference nucleic acid is compared to the amount of amplified product of its competitive template. In some embodiments, this comparison involves obtaining a relationship, e.g., a second relationship reflecting the amplified amount of reference nucleic acid compared with the amplified amount of its competitive template. In preferred embodiments, this relationship is provided as a ratio, e.g., a second ratio of the amount of amplified product of reference nucleic acid to the amount of amplified product of its competitive template, e.g., where the reference nucleic acid and its competitive template are co-amplified.

In preferred embodiments, comparison of the target nucleic acid to a reference nucleic acid involves comparing the first and second relationships described above. For example, a relationship reflecting how the first relationship compares with the second relationship can be obtained. In some embodiments, this relationship compares the first ratio to the second ratio, e.g., as a ratio of the first and second ratios.
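
The first ratio, the second ratio, and the ratio of the two can be illustrated with a short numerical sketch. The function names and the example amounts below are hypothetical; the amounts stand for any consistent measure of amplified product (e.g., densitometric peak areas) and are not measured data.

```python
def product_ratio(native_amount, competitive_template_amount):
    """Ratio of amplified product of a nucleic acid to amplified product of
    its co-amplified competitive template."""
    if competitive_template_amount <= 0:
        raise ValueError("competitive template amount must be positive")
    return native_amount / competitive_template_amount


def ratio_of_ratios(target_native, target_ct, reference_native, reference_ct):
    """First ratio (target / its competitive template) divided by the
    second ratio (reference / its competitive template)."""
    first_ratio = product_ratio(target_native, target_ct)
    second_ratio = product_ratio(reference_native, reference_ct)
    if second_ratio == 0:
        raise ValueError("reference ratio must be non-zero")
    return first_ratio / second_ratio


# Hypothetical amounts in arbitrary units.
print(ratio_of_ratios(target_native=50, target_ct=100,
                      reference_native=200, reference_ct=100))
```

With these assumed amounts the first ratio is 0.5, the second ratio is 2.0, and the ratio of the first and second ratios is 0.25.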

The adjectives “first,” “second,” “third” and so forth, as used herein, do not necessarily indicate any order of preference, importance, chronology, or degree of a quality, concentration, and/or amount. Rather, the terms are used to differentiate nouns qualified by the adjectives, e.g., a first and a second ratio can mean two different ratios; a second nucleic acid can mean a different nucleic acid to that referred to as the first nucleic acid.

In a two-step process, amplified product obtained after the first or second (or higher) round for target nucleic acid (and its corresponding competitive template), and amplified product obtained after the first or second (or higher) round for reference nucleic acid (and its corresponding competitive template), may be used in the comparisons described above. For example, in preferred embodiments, a first relationship is obtained comparing second amplified product of the target nucleic acid to second amplified product of the competitive template for the target nucleic acid; a second relationship is obtained comparing first amplified product of reference nucleic acid to first amplified product of competitive template for the reference nucleic acid; and the first and second relationships are compared. In more preferred embodiments, the relationship obtained by comparing the first and second relationships remains substantially constant beyond the exponential phase of amplification of the nucleic acid. Substantially constant can refer to variations of ± about 1%, about 5%, about 10%, about 15%, or about 20% of an absolute constant number.

In some embodiments, another one of the nucleic acids amplified can serve as a second reference nucleic acid. In such embodiments, measuring the amount of target nucleic acid can comprise obtaining a third relationship that compares the first amplified product of this second reference nucleic acid to the first amplified product of competitive template for the second reference nucleic acid; and comparing the first and third relationships. Also, in some embodiments, data calculated using a first reference nucleic acid can be re-calculated relative to that of another reference nucleic acid.
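
One way such a re-calculation could be carried out, assuming the relationships are expressed as ratios, is sketched below. Because a value normalized to a first reference is the first ratio divided by the second ratio, multiplying by the second ratio and dividing by the third ratio yields the value normalized to the second reference. The function name and the numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def renormalize(value_vs_first_reference, first_reference_ratio, second_reference_ratio):
    """Convert a value normalized to a first reference nucleic acid into a value
    normalized to a second reference nucleic acid, given each reference's
    native/competitive-template ratio from the same sample."""
    if second_reference_ratio <= 0:
        raise ValueError("second reference ratio must be positive")
    return value_vs_first_reference * first_reference_ratio / second_reference_ratio


# Hypothetical ratios: target 0.5, first reference 2.0, second reference 4.0.
value_vs_first = 0.5 / 2.0
print(renormalize(value_vs_first, first_reference_ratio=2.0, second_reference_ratio=4.0))
```

With these assumed ratios the result equals the first ratio divided by the third ratio (0.5 / 4.0), as expected when the target is normalized directly to the second reference.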

In some embodiments, using two or more reference nucleic acids can provide an understanding of inter-specimen and/or inter-sample variation among the reference nucleic acids. In some embodiments, for example, β-actin and GAPD can be used as first and second reference nucleic acids. For example, there is a significant correlation between the ratio of β-actin/GAPD expression and cell size (Willey, J. C., Crawford, E. L., and Jackson, C. M. (1998) Expression measurement of many genes simultaneously by quantitative RT-PCR using standardized mixtures of competitive templates. Am. J. Respir. Cell Mol. Biol. 19, 6-17), which may make use of these two reference nucleic acids preferred in some embodiments. In some embodiments, any measured nucleic acid or combination of nucleic acids, including all measured nucleic acids, can be used as a reference. The number of genes that must be quantitated for normalization to any of the nucleic acids measured to result in adequate normalization may vary depending on the samples being studied.

As mentioned above, in some embodiments, a two-step method may comprise two-step amplification of the nucleic acid serving as a reference nucleic acid. In some such embodiments, a fourth relationship may be obtained comparing second amplified product of the reference nucleic acid to second amplified product of its competitive template. In some embodiments, the first and fourth relationships are compared, e.g., by obtaining a ratio of the first and fourth ratios. In still other embodiments, where the nucleic acid serving as a reference nucleic acid is amplified in two rounds, first amplified product of the target nucleic acid and first amplified product of its competitive template can be used to obtain the first relationship, e.g., the first ratio.

Where the “two-step” approach is extended for more than two rounds of amplification, second amplified product of a nucleic acid and of a competitive template for the nucleic acid can be diluted and still further amplified, e.g., to produce third amplified product thereof. The steps of diluting and further amplifying may be repeated at least about once, at least about twice, at least about 3 times, at least about 5 times, at least about 10 times, at least about 20 times, at least about 50 times, or at least about 100 times or more.

In some embodiments, comparing the first and second and/or first and third and/or first and fourth relationships can provide a “ratio of ratios” corresponding to a numerical value. In some embodiments, numerical values for various measured nucleic acids, e.g., for various gene expression measurements, are provided as a database, as described in more detail below. For example, such a database can be used with gene expression data in clinical diagnostic testing.

In some embodiments, obtaining the comparisons, e.g., the first, second, third and/or fourth ratios, involves measuring the amounts of amplified product of each of the nucleic acid, the competitive template for the nucleic acid, the reference nucleic acid(s) and the competitive template(s) for the reference nucleic acid(s). Any method capable of quantifying nucleic acids having a distinguishable feature (e.g., having different sizes and/or sequences) can be used. Quantifying methods may involve separating and/or isolating the amplified product, for example, by use of electrophoresis, solid phase hybridization such as arrays, mass spectrometry, chromatography, HPLC and/or other methods known in the art for separating different nucleic acid molecules.

The electrophoresis used may be one or more of gel electrophoresis (e.g., agarose and/or polyacrylamide gel electrophoresis), capillary electrophoresis (e.g., using a capillary electrophoresis device like PE 310 or a microfluidic CE device like Agilent 2100 or Calipertech AMS 90 high-throughput system), and/or other types of electrophoresis devices known in the art. See, e.g., G. Gilliland, S. Perrin, K. Blanchard and H. F. Bunn, Proc. Natl. Acad. Sci. USA 87, 2725-2729 (1990); M. J. Apostolakos, W. H. Schuermann, M. W. Frampton et al., Analytical Biochemistry 213, 277-284 (1993). Further, capillary electrophoresis (CE), in particular microfluidic CE technology, can allow measurement of nucleic acid in very small volumes. See, e.g., T. S. Kanigan et al., in Advances in Nucleic Acid and Protein Analyses, Manipulation, and Sequencing, P. A. Limbach, J. C. Owicki, R. Raghavachari, W. Tan, Eds., Proc. SPIE 3926: 172 (2000). Other electrophoresis devices that may be used include, for example, Agilent or ABI 310. In some embodiments, separation of amplified product on agarose gel, a PerkinElmer 310 CE (ABI Prism 310 Genetic Analyzer), and a 2100 Bioanalyzer microfluidic CE (Agilent, Santa Clara, Calif., USA) were shown to provide statistically similar and reproducible results. See E. L. Crawford, L. A. Warner, D. A. Weaver and J. C. Willey, Quantitative end-point RT-PCR expression measurement using the Agilent 2100 Bioanalyzer and standardized RT-PCR. Agilent Application September 2001, 1-8.

Where amplified products are to be separated by electrophoresis, the size of the competitive templates and/or reference nucleic acid(s) can be selected to differ from that of the target nucleic acid. For example, in some embodiments, amplified product generated from the reference nucleic acid and the target nucleic acid are of sufficiently different sizes to be separated by electrophoresis. Further, in some embodiments, amplified product generated from the competitive template for a given nucleic acid and the given nucleic acid are of sufficiently different sizes to be separated by electrophoresis.

In some embodiments, a size difference is achieved by using a competitive template for a given nucleic acid that is longer or shorter than the given nucleic acid. In some embodiments, this size differential can be achieved by restriction endonuclease digestion of the amplified product where the competitive template differs from its corresponding nucleic acid by the addition or lack of a restriction endonuclease site. For example, in a specific embodiment, GAPD competitive templates were prepared that separate from native GAPD on the basis of EcoRI or BamHI digestion. Separation on the basis of other restriction endonuclease digestion may also be used. Further, in some embodiments, the same recognition site can be used for both the reference nucleic acid and the nucleic acid to be measured.

In addition, in some embodiments, the length of the amplified product after restriction endonuclease digestion is a factor to be considered. For example, in certain embodiments, greater nucleic acid size differences are preferred for adequate separation on agarose gels, e.g., preferably about 40, about 50, about 80, about 100 or about 120 base pair differences.

Separated products may be quantified by any methods known in the art and/or described herein, including, for example, use of radiolabeled probes, autoradiography, and preferably by spectrophotometry and/or densitometry, e.g., densitometry of ethidium bromide stained gels. Other methods that may be used to quantify amplified product include chromatography, e.g., high-performance liquid chromatography (HPLC); gas chromatography; and/or mass spectrometry, e.g., matrix-assisted laser desorption ionization-time-of-flight mass spectrometry (MALDI-TOF-MS) (An economic forecast for the gene expression market, http://www.researchandmarkets.com/reports/5545).

In some embodiments, amplified products are measured using solid-phase hybridizations. Some embodiments, for example, comprise use of an array, including microbeads and/or microarrays. Arrays can include, for example, oligonucleotide arrays, including cDNA, DNA, and/or RNA oligonucleotide arrays. Such arrays may comprise a macroarray, a microarray (e.g., a microfluidic array), and/or a nanoarray. In some embodiments, the amplified product and/or the oligonucleotide hybridizing thereto may be labeled, e.g., with a detectable moiety. For example, one or more of the nucleotides in the amplification reaction may be labeled with a detectable moiety. Detectable moieties that can be used include fluorescent moieties, radioactive moieties, quantum dots, and/or luminescent systems.

In some embodiments, arrays for use in the practice of the present invention comprise oligonucleotides immobilized on a solid support where a first set of the immobilized oligonucleotides can bind to a sequence of the amplified product of the nucleic acid that is not common to the amplified product of the competitive template for the nucleic acid and where a second set of the immobilized oligonucleotides can bind to a sequence of the amplified product of the competitive template of the nucleic acid that is not common to the amplified product of the nucleic acid, for example, sequences that span the juncture between the 5′ end of the competitive template and the truncated, mis-aligned 3′ end of the competitive template (e.g., that can be prepared according to the method of Celi). Amplified product of the nucleic acid and of the competitive template for the nucleic acid can be allowed to bind to the array and a ratio obtained from the two sets. In still other embodiments, the two-step approach can be practiced without the use of solid phase hybridizations, e.g., without the use of arrays.

The use of two rounds in preferred embodiments of a two-step process can lower the threshold amount of nucleic acid that can be measured in a sample. The lower threshold of detection can be defined as the minimum amount of analyte that can be reliably detected above background. The detection limit can be defined as the lowest concentration or quantity of analyte that can be detected with reasonable certainty. Without being limited to a particular hypothesis and/or theory, there may be a minimum amount of cDNA that can be used to achieve a statistically significant measurement. Lower threshold of detection in gene expression measurements may be considered in terms of the minimal number of molecules of cDNA in a reaction for amplification or the minimal number of cells.

FIG. 5 schematically illustrates how the amount of cDNA used in a PCR reaction has a direct relationship to the number of copies of mRNA transcripts/cell that can be measured for a given number of cells used. The minimal number of cells then depends on mRNA copies/cell in a sample, as well as the efficiency of RNA extraction and/or reverse transcription. For example, consider the number of cells to provide RNA sufficient to result in at least 10 molecules of cDNA for a particular gene. It generally is assumed that RNA extraction is close to about 100% efficient whereas reverse transcription is about 10% efficient. Thus, if a homogeneous population of cells is studied and each cell contains 10 copies of mRNA for a gene, 1 copy per cell will remain after reverse transcription. Due to stoichiometric considerations, use of cDNA samples that contain less than about 10 molecules of a transcript in a PCR reaction is questionable in some types of PCR. In such embodiments, cDNA representing about 10 cells is preferably present in the PCR reaction, as illustrated in FIG. 5. If a heterogeneous cell population is studied in which 1 cell out of 100 expresses a particular transcript, cDNA representing about 1,000 cells is preferably present in the PCR reaction.
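The cell-number reasoning above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and its parameters are hypothetical, and the defaults (about 100% extraction efficiency, about 10% reverse transcription efficiency, a floor of 10 cDNA molecules) are the assumptions stated in the preceding paragraph.

```python
import math

def min_cells_required(mrna_copies_per_cell, rt_efficiency=0.10,
                       extraction_efficiency=1.0, expressing_fraction=1.0,
                       min_cdna_molecules=10):
    """Smallest number of cells whose cDNA yields at least min_cdna_molecules.

    All defaults are illustrative assumptions taken from the text.
    """
    # Average cDNA molecules contributed by each cell in the population.
    cdna_per_cell = (mrna_copies_per_cell * extraction_efficiency
                     * rt_efficiency * expressing_fraction)
    # Subtract a tiny tolerance so floating-point noise does not round up.
    return math.ceil(min_cdna_molecules / cdna_per_cell - 1e-9)

# Homogeneous population: every cell carries 10 mRNA copies of the gene.
homogeneous = min_cells_required(10)
# Heterogeneous population: 1 cell in 100 expresses the transcript.
heterogeneous = min_cells_required(10, expressing_fraction=0.01)
```

Under these assumptions `homogeneous` evaluates to 10 cells and `heterogeneous` to 1,000 cells; other extraction or reverse transcription efficiencies can be explored by changing the keyword arguments.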

In certain embodiments, the use of two rounds can overcome some of the limitations illustrated in FIG. 5. Consider a typical about 10 μl cDNA sample representing about 1,000 cells and comprising about 6×10⁵ molecules of β-actin nucleic acid. Genes expressed at the mean level (100-fold lower than β-actin) are represented by about 6,000 molecules in the sample. A number of genes that may be important functionally are expressed 10,000-fold lower than β-actin, and for such genes there would be about 60 molecules represented in the sample. In a 100-fold smaller sample of about 100 nanoliters, genes expressed 10,000-fold lower than β-actin would be represented by about 0.6 copies or fewer.
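The arithmetic in this example can be reproduced as follows. The sketch is illustrative only; the function names are hypothetical, and the Poisson sampling model used to estimate the chance of an aliquot containing no copies is an added assumption, not something stated in the text.

```python
import math

BETA_ACTIN_MOLECULES = 6e5   # beta-actin molecules in the 10 µl cDNA sample
SAMPLE_VOLUME_NL = 10_000    # 10 µl expressed in nanoliters

def expected_copies(fold_below_beta_actin, aliquot_nl=SAMPLE_VOLUME_NL):
    """Expected molecules of a gene in an aliquot of the cDNA sample."""
    total = BETA_ACTIN_MOLECULES / fold_below_beta_actin
    return total * aliquot_nl / SAMPLE_VOLUME_NL

def prob_no_copies(mean_copies):
    """Probability that an aliquot holds zero copies (Poisson assumption)."""
    return math.exp(-mean_copies)

mean_level = expected_copies(100)            # about 6,000 molecules
rare_gene = expected_copies(10_000)          # about 60 molecules
rare_in_100nl = expected_copies(10_000, 100) # about 0.6 molecules
```

With a mean of about 0.6 copies, `prob_no_copies(rare_in_100nl)` is roughly 0.55 under the assumed Poisson model, which illustrates why such a small aliquot of unamplified cDNA is unsuitable for measuring rare transcripts.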

In certain embodiments of the instant invention, about 10 nanoliters of an about 10 μl round one amplified product may be used in a round two reaction volume of about 100 nanoliters. Because more than about 1,000,000-fold amplification is routinely achieved in the round one reaction, about 10 nanoliters of the about 10 μl round one reaction will contain ample amplified product of nucleic acid and competitive template to be measured with statistical confidence after round two.
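A rough estimate of how many molecules are carried into round two can be sketched as below. The function and parameter names are hypothetical, and the calculation assumes an idealized, uniform 1,000,000-fold amplification in round one with no plateau effects.

```python
def molecules_carried_into_round_two(starting_molecules, round_one_fold=1e6,
                                     round_one_volume_nl=10_000, transfer_nl=10):
    """Molecules present in the aliquot transferred to the round two reaction."""
    # Idealized round one yield across the whole reaction volume.
    amplified = starting_molecules * round_one_fold
    # Only the transferred fraction of the round one volume is carried over.
    return amplified * transfer_nl / round_one_volume_nl

# A rare transcript represented by about 60 molecules before round one.
carried = molecules_carried_into_round_two(60)
```

Under these assumptions `carried` is about 60,000 molecules, far above the roughly 10-molecule floor discussed with reference to FIG. 5.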

Further, in some preferred embodiments, the use of two rounds can increase the number of measurements obtainable from a small sample of nucleic acid. For example, in some embodiments, at least about 10,000, at least about 50,000, at least about 80,000, at least about 100,000, or at least about 150,000 nucleic acid measurements can be obtained from the same amount of starting nucleic acid typically used to obtain one measurement using the processes provided in Willey and Willey et al. '390, '606, and '978. In some embodiments, at least about 200,000, at least about 500,000, at least about 800,000, at least about 1,000,000, or at least about 1,500,000 nucleic acid measurements can be obtained from the same amount of starting nucleic acid typically used to obtain one measurement using the processes provided in Willey and Willey et al. '390, '606, and '978, preferably without loss of sensitivity to detect rare transcripts. For example, in some embodiments, sufficient amplified product can be generated to measure nucleic acids corresponding to several genes in about 100 to about 1,000 cell samples. Using the processes provided in Willey and Willey et al. '390, '606, and '978, cDNA representing about 100 to about 1,000 cells is typically used to measure one nucleic acid in one PCR reaction. Referring again to FIG. 5, using this amount allows detection of transcripts that are expressed at about 0.1 to about 1 copy per cell (or about 1 to about 10 copies per 10 cells) with statistical significance. The same amount of cDNA can be used in a first round of amplification in certain embodiments of the instant invention. Since this cDNA is co-amplified with a competitive template for the nucleic acid to be measured, and since the relationship of endogenous cDNA to its competitive template remains constant or substantially constant, amplified product from round one can be diluted and further amplified in a second round with primers specific to a given nucleic acid without significantly changing the relative amounts of amplified product.
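The claim that dilution and a second round leave the relative amounts unchanged can be illustrated with a toy model. This is a sketch under stated assumptions: the native template (NT) and competitive template (CT) are taken to amplify with identical per-cycle efficiency and without plateau; the function names, cycle counts and efficiency value are hypothetical.

```python
def amplify(nt, ct, cycles, efficiency=0.9):
    """Amplify NT and CT by the same factor (idealized co-amplification)."""
    factor = (1 + efficiency) ** cycles
    return nt * factor, ct * factor

def two_round_ratio(nt, ct, dilution=1000, cycles_one=35, cycles_two=35):
    """NT/CT ratio after round one, dilution, and round two."""
    nt1, ct1 = amplify(nt, ct, cycles_one)
    # Dilution removes the same fraction of both products.
    nt2, ct2 = amplify(nt1 / dilution, ct1 / dilution, cycles_two)
    return nt2 / ct2

ratio_before = 60 / 600
ratio_after = two_round_ratio(60, 600)
```

In this idealized model `ratio_after` equals `ratio_before` (0.1) to within floating-point error, because every step multiplies NT and CT by the same factor.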

Further, in some embodiments, use of two rounds can increase the number of nucleic acids that can be measured in a given sample. Some embodiments, for example, allow replicate measurement of many genes in small amounts of specimen material.

In some embodiments, methods of the instant invention reduce false positives to a statistically insignificant number. In some embodiments, false negatives are reduced to a statistically insignificant number, and in some embodiments, eliminated. For example, where a competitive template is used in a number of nucleic acid measurements, there may be no false negatives and a statistically insignificant number of false positives.

B. Use of a Standardized Mixture

In some embodiments, the two-step approach of assessing a nucleic acid in a sample can comprise use of a standardized mixture. “Standardized mixture” as used herein can refer to a mixture comprising a number of internal standards, e.g., a number of competitive templates, at known concentrations. In preferred embodiments, the standardized mixture comprises a competitive template for at least one target nucleic acid and a competitive template for at least one reference nucleic acid in a sample, where the competitive templates are at known concentrations relative to each other. In more preferred embodiments, the competitive templates are at fixed concentrations relative to other, up to and including all other, competitive templates in the mixture.

FIG. 6 illustrates a standardized mixture used in some embodiments of the present invention. Feature 601 illustrates a sample, Sample A, which comprises a number of nucleic acids to be measured, corresponding to Genes 1 to 6-n, as well as a nucleic acid to serve as a reference, corresponding to β-actin in this illustration.

Feature 602 illustrates a standardized mixture of internal standards comprising competitive templates for the reference nucleic acid (β-actin standard) as well as competitive templates for target nucleic acids (Genes 1 to 6-n standards). In some embodiments, the number of competitive template(s) can be at least one other competitive template in addition to a target nucleic acid, at least about 100, at least about 200, at least about 500, at least about 1,000, at least about 5,000, at least about 10,000, at least about 50,000, or at least about 100,000 other competitive templates. For example, competitive templates for several genes to be measured can be included in a given standardized mixture, as illustrated in feature 602.

Feature 603 (vertical two-way arrows) illustrates a relationship among internal standards within a standardized mixture. A competitive template for each of a number of genes can be at a fixed concentration relative to other competitive templates within a standardized mixture. Accordingly, in some embodiments, when a cDNA sample is combined with a standardized mixture, the concentration of each competitive template is fixed relative to the cDNA representing its corresponding gene.

Feature 604 (horizontal two-way arrows) illustrates a relationship between an internal standard and its corresponding cDNA from a sample and how each target nucleic acid is measured relative to its respective competitive template in the standardized mixture. Because the competitive template for each of these nucleic acids is present at a fixed concentration relative to other competitive templates, the standardized mixture can allow a target nucleic acid to be assessed relative to other nucleic acids being measured with the standardized mixture 602. For example, Sample A 601 can be combined with standardized mixture 602, e.g., to form a master mixture used for further co-amplifications. For example, the master mixture can be used in co-amplifying nucleic acid corresponding to Gene 1 and its competitive template (Gene 1 standard), as well as co-amplifying nucleic acid corresponding to Gene 2 and its competitive template (Gene 2 standard).

In a two-step approach using standardized mixture 602, a target nucleic acid and its respective competitive template can be co-amplified to produce first amplified product thereof. The amplified products can be diluted and further co-amplified one or more times, as described in more detail above. In some embodiments, first amplified product of the reference nucleic acid can be diluted and further amplified one or more times, also as described above.

Feature 606 illustrates a number of other samples, Samples B_(1-n) 605, which also comprise nucleic acids, corresponding to Genes 1 to 6-n, and a reference nucleic acid, corresponding to β-actin. In some embodiments, the number of β-actin mRNA molecules obtained from a cell may vary from about 100 to about 1000, e.g., depending on efficiency of RNA extraction, the size and/or other characteristics of the cell.

In some embodiments, another nucleic acid can serve as a second reference nucleic acid. For example, in some embodiments, gene expression measured in reference to β-actin mRNA can be re-calculated relative to that of another reference nucleic acid, if so desired. For example, if another nucleic acid, e.g., GAPDH or any other of Genes 1 to 6-n 602, appears to vary less than β-actin across the samples B_(1-n) 605, the data may be re-calculated (“normalized”) to that reference without altering the relative expression measurement, e.g., the relative expression measurement within a sample. When nucleic acid measurement data are re-calculated, the relative measured amounts among nucleic acids can remain the same or substantially the same.

FIG. 7 illustrates a re-calculation using cyclophilin as a second reference gene, where gene expression is provided as a ratio of (target gene NT molecules)/(10⁶ β-actin NT molecules). In FIG. 7, NT refers to native template, and the target gene is c-myc.

Ratio 701 illustrates a gene expression value for the target gene as the ratio of (c-myc NT molecules)/(10⁶ β-actin NT molecules). Ratio 702 illustrates a gene expression value for the second reference gene as the ratio of (cyclophilin NT molecules)/(10⁶ β-actin NT molecules). Ratio 703 illustrates a conversion factor for re-calculating relative to cyclophilin. Ratio 703 provides the inverse of ratio 702, namely (10⁶ β-actin NT molecules)/(cyclophilin NT molecules). Conversion can be achieved by multiplying ratio 701 by ratio 703 to provide ratio 704. Ratio 704 illustrates the ratio (c-myc NT molecules)/(cyclophilin NT molecules), a gene expression value for the target gene relative to the new reference gene.

In other embodiments, conversion from (molecules of target nucleic acid)/(molecules of a first reference nucleic acid) to (molecules of target nucleic acid)/(molecules of a second reference nucleic acid) can be achieved, e.g., by inverting a gene expression value of the second reference, e.g., to (molecules of first reference nucleic acid)/(molecules of second reference gene) and multiplying this factor by the data. The value for molecules of the first reference nucleic acid can cancel out, leaving the second reference gene in the denominator.

Re-calculation may be accomplished using a spreadsheet, in some embodiments. In some cases, re-calculating relative to a new reference can alter the numerical value of a measured amount of a given nucleic acid without altering the numerical values of nucleic acids relative to each other. Without being limited to a particular hypothesis and/or theory, this may be explained in that measured amounts of a nucleic acid can be said to be linked through use of a common standardized mixture of competitive templates 602. Thus, the ratio between two nucleic acids within a sample would be the same or substantially the same using β-actin, cyclophilin, or a combination of nucleic acids as the reference nucleic acid.
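As an alternative to a spreadsheet, the re-calculation described for FIG. 7 can be sketched in Python. The function name is hypothetical and the expression values below are invented purely for illustration; they are not measured data.

```python
def renormalize(values_per_first_ref, new_reference):
    """Re-express values relative to a new reference gene.

    values_per_first_ref maps each gene to molecules per 10^6 molecules of
    the first reference (e.g., beta-actin). The result maps each gene to
    molecules per molecule of the new reference.
    """
    # Conversion factor: the inverse of the new reference's own value
    # (corresponds to ratio 703 in FIG. 7).
    conversion = 1.0 / values_per_first_ref[new_reference]
    return {gene: value * conversion
            for gene, value in values_per_first_ref.items()}

# Hypothetical values, in molecules per 10^6 beta-actin molecules.
per_actin = {"c-myc": 2_000.0, "p21": 500.0, "cyclophilin": 40_000.0}
per_cyclophilin = renormalize(per_actin, "cyclophilin")

# The ratio between two target genes is unchanged by the re-calculation.
ratio_before = per_actin["c-myc"] / per_actin["p21"]
ratio_after = per_cyclophilin["c-myc"] / per_cyclophilin["p21"]
```

With these illustrative numbers, c-myc becomes about 0.05 molecules per cyclophilin molecule, while the c-myc to p21 ratio stays at 4 before and after conversion, since the first reference cancels out.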

Feature 605 (two way arrows) illustrates how each of these nucleic acids in additional samples can be measured relative to its respective competitive template in the standardized mixture 602. As with Sample A 601, each of these nucleic acids can be assessed relative to other nucleic acids measured with the standardized mixture 602. Further, it is possible to compare data from analysis of Sample A 601 to data from analysis of samples B_(1-n) 604. For example, because the number of molecules for each competitive template is known within the standardized mixture, it is possible to calculate all data in the form of molecules/reference nucleic acid molecules. In some embodiments, the standardized mixture 602 comprises sufficient amounts of competitive templates for assessing one or more of the target nucleic acids in a large number of samples B_(1-n) 604, e.g., in more than about 10⁴ samples, in more than about 10⁵ samples, in more than about 10⁶ samples, in more than about 10⁷ samples, in more than about 10⁸ samples, in more than about 10⁹ samples, in more than about 10¹⁰ samples, in more than about 10¹¹ samples, in more than about 10¹² samples, in more than about 10¹³ samples, in more than about 10¹⁴ samples, or in more than about 10¹⁵ samples. In some preferred embodiments, use of a common standardized mixture for multiple samples can reduce time to obtain nucleic acid measurements. For example, re-preparing reagents for PCR reactions can be time consuming and can also lead to sources of error.
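The calculation of data in the form of molecules/reference nucleic acid molecules can be sketched as follows. The function names and the example ratios are hypothetical; the sketch simply assumes that the measured NT/CT product ratio reflects the starting NT/CT molecule ratio and that the number of CT molecules added from the standardized mixture is known.

```python
def nt_molecules(nt_ct_ratio, ct_molecules):
    """Native template molecules implied by a measured NT/CT ratio."""
    return nt_ct_ratio * ct_molecules

def expression_per_million_reference(target_ratio, target_ct,
                                     ref_ratio, ref_ct):
    """Target molecules per 10^6 reference molecules."""
    target_nt = nt_molecules(target_ratio, target_ct)
    reference_nt = nt_molecules(ref_ratio, ref_ct)
    return target_nt / reference_nt * 1e6

# Hypothetical measurement: target NT/CT = 2.0 against 600 CT molecules,
# beta-actin NT/CT = 1.0 against 600,000 CT molecules.
value = expression_per_million_reference(2.0, 600, 1.0, 600_000)
```

Here `value` is about 2,000 target molecules per 10⁶ β-actin molecules. Because every sample is measured against the same known CT amounts, values computed this way for different samples are expressed in the same units.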

A nucleic acid and its competitive template may be co-amplified (and/or further co-amplified) in the same or different vessels as one or more other nucleic acids and corresponding competitive templates. See, e.g., Apostolakos, M. J., Schuermann, W. H., Frampton, M. W., Utell, M. J., and Willey, J. C. (1993) Measurement of gene expression by multiplex competitive polymerase chain reaction. Anal. Biochem. 213, 277-284; Willey, J. C., Crawford, E. L., and Jackson, C. M. (1998) Expression measurement of many genes simultaneously by quantitative RT-PCR using standardized mixtures of competitive templates. Am. J. Respir. Cell Mol. Biol. 19, 6-17. The vessel used may be any object capable of allowing a reaction mixture to exist therein and/or thereon. For example, the vessel may comprise a well, tube, nano and/or microfluidic reservoir and/or channel, capillary, groove, surface, and/or other container.

In some preferred embodiments, use of a standardized mixture 602 allows different nucleic acids amplified in separate vessels to be directly compared. In some embodiments, for example, one nucleic acid and its competitive template are co-amplified in one vessel, while another nucleic acid and its competitive template are co-amplified in a different vessel. In either case, as feature 603 illustrates, nucleic acid can be measured relative to its respective internal standard competitive template within the standardized mixture and the other nucleic acid can serve as a reference nucleic acid. That is, in preferred embodiments, the use of a standardized mixture allows the concentration of internal standard for a nucleic acid relative to others to remain fixed across different measurements.

As feature 603 illustrates, use of a common standardized mixture allows direct comparisons to be made among Samples B_(1-n) 604. The different samples may be amplified at different times, e.g., on different days; in the same or different experiments in the same laboratory; and/or in different experiments in different laboratories. See, e.g., Crawford, E. L., Peters, G. J., Noordhuis, P., et al. (2001) Reproducible gene expression measurement among multiple laboratories obtained in a blinded study using standardized RT (StaRT)-PCR. Mol. Diagn. 6, 217-225; Crawford, E. L., Warner, K. A., Khuder, S. A., et al. (2002) Multiplex standardized RT-PCR for expression analysis of many genes in small samples. Biochem. Biophys. Res. Commun. 293, 509-516; Crawford, E. L., Khuder, S. A., Durham, S. J., et al. (2000) Normal bronchial epithelial cell expression of glutathione transferase P1, glutathione transferase M3, and glutathione peroxidase is low in subjects with bronchogenic carcinoma. Cancer Res. 60, 1609-1618; DeMuth, J. P., Jackson, C. M., Weaver, D. A., et al. (1998) The gene expression index c-myc×E2F1/p21 is highly predictive of malignant phenotype in human bronchial epithelial cells. Am. J. Respir. Cell. Mol. Biol. 19, 18-24; Mollerup, S., Ryberg, D., Hewer, A., Phillips, D. H., and Haugen, A. (1999) Sex differences in lung CYP1A1 expression and DNA adduct levels among lung cancer patients. Cancer Res. 59, 3317-3320; Rots, M. G., Willey, J. C., Jansen, G., et al. (2000) mRNA expression levels of methotrexate resistance-related proteins in childhood leukemia as determined by a standardized competitive template-based RT-PCR method. Leukemia 14, 2166-2175; Rots, M. G., Pieters, R., Peters, G. J., et al. (1999) Circumvention of methotrexate resistance in childhood leukemia subtypes by rationally designed antifolates. Blood 94, 3121-3128; Allen, J. T., Knight, R. A., Bloor, C. A., and Spiteri, M. A. (1999) Enhanced insulin-like growth factor binding protein-related protein 2 (connective tissue growth factor) expression in patients with idiopathic pulmonary fibrosis and pulmonary sarcoidosis. Am. J. Respir. Cell. Mol. Biol. 21, 693-700; Loitsch, S. M., Kippenberger, S., Dauletbaev, N., Wagner, T. O., and Bargon, J. (1999) Reverse transcription-competitive multiplex PCR improves quantification of mRNA in clinical samples: application to the low abundance CFTR mRNA. Clin. Chem. 45, 619-624; Vondracek, M. T., Weaver, D. A., Sarang, Z., et al. (2002) Transcript profiling of enzymes involved in detoxification of xenobiotics and reactive oxygen in human normal and Simian virus 40 T antigen-immortalized oral keratinocytes. Int. J. Cancer 99, 776-782. In preferred embodiments, measurements are made using the same standardized mixture and dilution of internal standard competitive templates.

Further, in some embodiments, measurements obtained using various quantifying approaches are directly comparable where a common standardized mixture is used. For example, statistically similar results were obtained using a common standardized mixture and quantifying amplified product by various types of electrophoresis, or by either a Caliper AMS 90 SE30 electrophoretic separation or by hybridizing them to microarrays in accordance with some embodiments of the instant invention. In another example, reproducible gene expression measurements were obtained when amplified product was quantitated using MALDI-TOF MS instead of using electrophoresis. See Ding, C. and Cantor, C. R. (2003) A high-throughput gene expression analysis technique using competitive PCR and matrix-assisted laser desorption ionization time-of-flight MS. Proc. Natl. Acad. Sci. USA 100, 3059-3064.

The use of the standardized mixtures may also be applied to other methods for measuring nucleic acids, e.g., in real-time RT-PCR. For example, in some embodiments, obtaining a ratio of amplified product of a nucleic acid to amplified product of a competitive template for the nucleic acid can comprise a use of real-time RT-PCR analyses. As another example, a standardized mixture may be used in accordance with some embodiments of the instant invention in combination with competitive template techniques described, e.g., in Siebert, P. D., et al., Nature 359:557-558 (1992); Siebert, P. D., et al., BioTechniques 14:244-249 (1993), and Clontech Brochure, 1993, Reverse Transcriptase-PCR (RT-PCR). For example, fluorescent probes for using a standardized mixture with real-time RT-PCR may be developed.

C. Use of Serially-Diluted Standardized Mixtures

In some embodiments, a series of serially-diluted standardized mixtures is used to assess amounts of nucleic acid. “Serially-diluted standardized mixtures” can refer to two or more standardized mixtures in which one or more of the reagents in the standardized mixtures is serially-diluted. In some embodiments, one or more reagents in the standardized mixtures is serially-diluted relative to a different one or more of the reagents in the mixtures. For example, in preferred embodiments, a competitive template for a first nucleic acid is serially diluted relative to a competitive template for a second nucleic acid where the second nucleic acid can act as a reference nucleic acid. In some embodiments, the reference nucleic acid can be present at two different concentrations in two of the serially-diluted standardized mixtures. One of a series of serially-diluted mixtures is also referred to herein as a “Mix.”

FIG. 8 illustrates use of a series of standardized mixtures, according to some embodiments of the instant invention. In the figure, “SMIS” refers to a standardized mixture of internal standards, prepared in accordance with embodiments of the instant invention.

Feature 801 illustrates a sample, Sample A, which comprises a number of nucleic acids to be measured, corresponding to Genes 1-12, as well as a nucleic acid that serves as a reference, corresponding to β-actin in this illustration.

Feature 802 illustrates a series of six standardized mixtures, Mixes A-F, comprising 10-fold dilutions of competitive templates for different genes relative to competitive templates for a reference gene, β-actin in this illustration.

Feature 803 illustrates the relationship between competitive templates for the reference nucleic acid (β-actin standard) compared to competitive templates for target nucleic acids (Genes 1 to 12 standards) in the different serially-diluted mixtures. Use of the series can allow measurement of the nucleic acids corresponding to different genes expressed over a range, e.g., a range of more than six orders of magnitude.

Feature 804 (two way arrows) illustrates how these different nucleic acids in the Sample 801 are in balance with (i.e., calibrated to) different concentrations of their corresponding competitive templates in the different mixes. “Balancing” or being in balance with, as used herein, can refer to calibrating amounts of two nucleic acids. For example, Genes 9 and 10 in Sample A 801, expressed at a low level, are in balance with Mix E comprising 600 molecules/μl of competitive template for Gene 9 and Gene 10. Genes 9 and 10 are preferably measured using Mix E. Genes 6 and 7 are expressed at a higher level in Sample A 801 and are in balance with Mix C and Mix D, respectively. Gene 6 is preferably measured using Mix C and Gene 7 is preferably measured using Mix D.

In some embodiments, use of a series allows measurement of nucleic acids over a range of concentrations. Where practice of the invention assesses gene expression, as in FIG. 8, some embodiments allow measurement over one or more orders of magnitude of gene expression. For example, in some embodiments, the amounts of two nucleic acids to be measured vary over a range of less than about one order of magnitude, more than about one order of magnitude, or more than about 2 orders of magnitude. In some embodiments, the amounts of two different nucleic acids to be measured, e.g., mRNA levels expressed from two or more different genes, vary over a range of about 3 or more orders of magnitude, about 4 or more orders of magnitude, about 5 or more orders of magnitude, about 6 or more orders of magnitude, or about 7 or more orders of magnitude, e.g., spanning an about 7-log range of gene expression including about 10⁻³, about 10⁻², about 0.1, about 1, about 10, about 10², about 10³, and about 10⁴ copies/cell. In some embodiments, the amounts of two different nucleic acids to be measured vary over a range of about 8 or more, about 9 or more, or about 10 or more orders of magnitude, e.g., spanning an about 10-log range of gene expression of about 10⁻³, about 10⁻², about 0.1, about 1, about 10, about 10², about 10³, about 10⁴, about 10⁵, or about 10⁶ copies/cell. Such ranges of gene expression may be important in detecting agents of biological warfare, for example.

Feature 805 illustrates a different sample, Sample B, also comprising nucleic acids corresponding to Genes 1-12 and to β-actin.

Feature 806 (two way arrows) illustrates that the different nucleic acids in the Sample B 805 are also in balance with different concentrations of their corresponding competitive templates in the different mixes. A given gene in a different sample can be in balance with the same Mix, allowing past experience with measuring a given gene to inform the selection of an appropriate Mix.

In some embodiments, the series can comprise serial 10-fold dilution from a standardized mixture comprising competitive template for more or less than the 12 genes of FIG. 8. For example, a series can be prepared for a 96-nucleic acid standardized mixture or a standardized mixture comprising various numbers of nucleic acids as detailed above.

In some embodiments, the method for assessing an amount of a nucleic acid involves providing a series of serially-diluted standardized mixtures comprising a competitive template for the nucleic acid and a competitive template for another nucleic acid present in a number of samples comprising the nucleic acid, where the competitive templates are at known concentrations relative to each other; combining one of the samples comprising the nucleic acid with one of the serially-diluted standardized mixtures; co-amplifying the nucleic acid and its competitive template to produce amplified product thereof; obtaining a first relationship that compares amplified product of the nucleic acid to amplified product of its competitive template; determining whether the relationship corresponds to a ratio within about 1:10 to about 10:1; and if not, repeating the combining, co-amplifying, obtaining and determining steps using a second one of the serially-diluted standardized mixtures. Further, in some embodiments, the other nucleic acid and its competitive template can be co-amplified to produce amplified product thereof; a second relationship obtained that compares amplified product of the other nucleic acid to its competitive template; and the first and second relationships compared.
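The iterative selection of a Mix described above can be sketched as a simple loop. Several things here are assumptions made for illustration: the CT amounts for Mixes other than Mix E are extrapolated from the 600 molecules/μl figure and 10-fold steps, the function names are hypothetical, and `measure_ratio` is a placeholder callable standing in for the laboratory steps of combining, co-amplifying and obtaining the NT/CT relationship.

```python
# Hypothetical CT molecules per µl for a target gene in each Mix. Only the
# Mix E value (600) comes from the text; the rest assume 10-fold steps.
MIXES = {"A": 6_000_000, "B": 600_000, "C": 60_000,
         "D": 6_000, "E": 600, "F": 60}

def in_range(ratio, low=0.1, high=10.0):
    """True when the NT/CT ratio lies within about 1:10 to about 10:1."""
    return low <= ratio <= high

def select_mix(measure_ratio, first="E", mixes=MIXES):
    """Try the first Mix, then the others, until the ratio is in range."""
    order = [first] + [name for name in mixes if name != first]
    for name in order:
        ratio = measure_ratio(name)  # stands in for a wet-lab measurement
        if in_range(ratio):
            return name, ratio
    return None, None

# Simulated sample containing 45,000 NT molecules/µl of the target gene.
chosen, ratio = select_mix(lambda name: 45_000 / MIXES[name])
```

In this simulation Mix E gives a ratio of 75, which is outside the range, and the loop settles on Mix C with a ratio of 0.75. A practical implementation would more likely use the first out-of-range ratio to jump directly to the appropriate Mix rather than trying each in turn.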

In some embodiments, a “two-step” approach may be used. For example, in some embodiments, the method further comprises diluting amplified product of nucleic acid and its corresponding competitive template; and further co-amplifying the diluted amplified product to produce further amplified product thereof.

In some embodiments, different concentrations of competitive templates for reference nucleic acid may be used. For example, where the expression of a first reference nucleic acid varies in comparison to a second reference nucleic acid, use of more than one concentration can be helpful in determining inter-sample and/or inter-specimen variation in expression of corresponding reference genes. For example, some embodiments use two different concentrations of GAPD competitive templates, as the expression of GAPD relative to β-actin may vary as much as about 100-fold from one tissue type to another. Having two different concentrations of GAPD competitive template relative to that for β-actin can enable better comparison of GAPD to β-actin in various samples.

FIG. 9 illustrates how, in some embodiments, nucleic acid serving as a reference can be used to balance a sample with a standardized mixture of the series of serially-diluted standardized mixtures.

Step 901 illustrates quantitative balancing of a nucleic acid sample. Qualitative balancing, as used herein, can also be referred to as qualitative calibration. The nucleic acid sample can be diluted to provide a series of serially-diluted samples and one of the series selected, for combining with standardized mixture, depending on the concentration of the reference nucleic acid in the dilution. For example, at step 901, cDNA material is serially-diluted to provide a series of samples having serial dilutions of β-actin nucleic acid.

Step 902 illustrates that a dilution is selected to provide about equivalent β-actin native template (NT) molecules as there are β-actin competitive template (CT) molecules in a SMIS Mix. In some embodiments, a specimen can be diluted until any one (or more) of the nucleic acids is approximately balanced with, i.e., approximately calibrated to, the amount of competitive template for that nucleic acid in the standardized mixture. Thus, in preferred embodiments, the first one of the number of samples to be combined with standardized mixture is selected to provide reference nucleic acid calibrated or approximately calibrated to its competitive template in the standardized mixture. Approximate calibration can occur when the nucleic acid is within about a 10-fold range, a 9-fold range, an 8-fold range, a 7-fold range, a 6-fold range, a 5-fold range, a 4-fold range, a 3-fold range, a 2-fold range, or a 1-fold range or less, of the concentration of the competitive template for that particular nucleic acid in the standardized mixture. In preferred embodiments, the NT/CT ratio for the reference nucleic acid is between about 1:10 and about 10:1 (e.g., for measurement to be within linear dynamic range).
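Selection of a balanced dilution, as in steps 901 and 902, can be sketched as follows. The function names are hypothetical, and the stock concentration passed in is a stand-in for what would in practice be inferred from measured NT/CT ratios of the serially-diluted samples; the 600,000 CT molecules figure is the β-actin amount used elsewhere in this description.

```python
import math

def pick_dilution(stock_nt_per_ul, ct_per_ul=600_000, fold=10, steps=6):
    """Return (step, NT/CT ratio) for the dilution closest to a 1:1 ratio.

    Step i corresponds to diluting the stock fold**i times.
    """
    def ratio_at(step):
        return stock_nt_per_ul / fold**step / ct_per_ul

    # Closeness to 1:1 is judged on a log scale so 1:10 and 10:1 are equal.
    best = min(range(steps), key=lambda s: abs(math.log10(ratio_at(s))))
    return best, ratio_at(best)

def approximately_calibrated(ratio, max_fold=10):
    """True when NT is within max_fold of CT in either direction."""
    return 1 / max_fold <= ratio <= max_fold

# Simulated stock cDNA with 4.5e8 beta-actin NT molecules per µl.
step, ratio = pick_dilution(4.5e8)
```

For this simulated stock the third 10-fold dilution (a 1,000-fold dilution overall) gives an NT/CT ratio of about 0.75, which `approximately_calibrated` accepts as lying between about 1:10 and about 10:1.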

FIG. 10 further illustrates selection of a cDNA dilution that provides a reference nucleic acid (β-actin in this illustration) in balance with 600,000 molecules of the reference nucleic acid competitive template in the standardized mixture, e.g., so the nucleic acid can compete equally (or substantially equally) with the 600,000 competitive template molecules. In preferred embodiments, all standardized mixtures in a given series contain a given number of molecules of a particular reference nucleic acid, allowing any of the standardized mixtures to be used in balancing. For example, Mixes A-F can each contain about 10⁻¹² M β-actin competitive template so that any of Mixes A-F can be used in balancing with a cDNA sample. Typically, Mix F is used for balancing β-actin cDNA in a sample.

FIG. 11 illustrates a series of serially-diluted standardized mixtures comprising one or more mixes where 1 μL contains 600,000 molecules of β-actin competitive template, corresponding to 1 μL of a standardized mixture containing 10⁻¹² M β-actin competitive template. In that case, for example, cDNA material can be diluted until 1 μL is calibrated to 600,000 molecules of β-actin competitive template. Typically, this is the amount of cDNA derived from 100 to 1,000 cells in the case of β-actin. Although the number of β-actin mRNA copies/cell varies from one cell to another, using a conservative estimate of 600 β-actin mRNA copies/cell and assuming a reverse transcription efficiency of 10%, a cDNA sample containing 600,000 molecules of β-actin cDNA can be derived from 1,000 cells.

This amount may be used to provide sufficient cDNA to quantify genes expressed at low levels, e.g., genes expressed in low copy number, e.g., at about 0.1 copy/cell, 0.05 copies/cell, and/or 0.01 copies/cell. With reference cDNA in balance with about 10⁻¹² M β-actin in the PCR reaction, some embodiments can quantify sample nucleic acid that is in balance with about 10⁻¹⁶ M or less of its CT. In some specific embodiments, where reference cDNA is in balance with about 10⁻¹² M β-actin in a 10 μl PCR reaction volume, there can be about 600,000 molecules of β-actin NT and about 600,000 molecules of β-actin CT in the reaction, and the number of molecules of sample nucleic acid in balance with about 10⁻¹⁶ M or about 10⁻¹⁷ M of its CT can be about 60 or about 6, respectively. About 60 or about 6 molecules of nucleic acid can translate into about 0.1 to about 0.01 molecules/cell.

This balancing can provide at least about 10 copies present at the beginning of amplification, avoiding, e.g., stoichiometric problems. In some embodiments where less sensitivity is sought, less cDNA may be used. For example, in some embodiments, an amount of cDNA approximately in balance with 60,000 molecules of β-actin CT can be used, allowing reduced consumption of cDNA, e.g., by about 10-fold.

A first one of the serially-diluted standardized mixtures can be selected for combining with the nucleic acid sample. FIG. 12 illustrates that Mix E can be used initially, based on the expression levels of most genes. There appears to be a stoichiometric and/or stochastic distribution of expression among genes (see, e.g., Kuznetsova, et al., General Statistics of Stochastic Process of Gene Expression in Eukaryotic Cells, Genetics, Vol. 161, 1321-1332, July 2002), with a mean approximately 2 orders of magnitude lower than the expression for β-actin, e.g., in human bronchial epithelial cells. Without being limited to a given theory and/or hypothesis, the distribution of gene expression levels in cells indicates that mRNA transcripts of many genes will be balanced with Mix E, in some embodiments.

FIG. 12 further illustrates that the use of a series of serially-diluted standardized mixtures of some embodiments can allow gene expression measurement over the full spectrum observed. As FIG. 12 illustrates through color-coding, different Mixes can be used to measure genes expressed at different levels with good reproducibility. Because there are about 100 to about 1,000 β-actin copies/cell for most cell types, this level of sensitivity allows measurement of one molecule per about 100 to about 1,000 cells. At the other end of the expression spectrum, a standardized mixture comprising greater concentrations of competitive templates can allow measurement of more highly expressed genes. For example, Mix A, in some embodiments, can allow measurement of more than 10⁷ molecules/10⁶ molecules of β-actin (about 1,000 to about 10,000 copies/cell). Examples of genes expressed at these levels include UGB (Genbank no. U01101) and vimentin (X56134).

In other embodiments, a different mix may be used initially based on past experience and/or prediction of the amounts of nucleic acid expected. For example, Mix A, Mix B, Mix C, Mix E, or Mix F may be used initially. In preferred embodiments, the mixture selected is one containing a concentration of competitive template likely to be approximately calibrated with (e.g., within about a 10-fold range) the gene or genes being assessed. In preferred embodiments, an appropriate standardized mixture can be selected based on data in some embodiments of standardized expression databases described herein.

After combining a sample comprising a nucleic acid to be measured with one of the series of serially-diluted standardized mixtures, the nucleic acid and its competitive template can be co-amplified, e.g., as described in detail above. Also as described above, a ratio can be obtained comparing the amount of amplified product of the nucleic acid to the amount of amplified product of its corresponding competitive template. Although a reference nucleic acid in the sample was balanced with its competitive template in the Mix, the target nucleic acid may not be balanced. Where the amounts of amplified product of a target nucleic acid and of its competitive template differ greatly, the co-amplification may be repeated using a different Mix of the series of serially-diluted mixtures. That is, a second and/or subsequent serially-diluted standardized mixture can be selected for combining with the nucleic acid sample.

FIG. 13 illustrates a situation where the initial Mix did not provide competitive template for target nucleic acid sufficiently in balance with the amount of target nucleic acid in the cDNA dilution. The target nucleic acid in this illustration corresponds to c-myc; IS refers to an internal standard competitive template. As FIG. 12 illustrates, amplified product of c-myc NT is not within about a 10-fold amount of amplified product of c-myc CT. In some embodiments, software determines area under curve for the NT and CT and calculates the ratio of NT/CT for the target nucleic acid.

In preferred embodiments, the next Mix selected from the series is based on the ratio obtained when amplified product of the target nucleic acid is compared to amplified product of its competitive template. For example, where the ratio is less than about 1/10, a more dilute mixture from the series will be used subsequently; where the NT/CT ratio is more than about 10/1, a more concentrated mixture from the series will be used. FIG. 12 illustrates the situation where a large ratio is obtained, indicating that a more concentrated Mix should be used next, e.g., Mix C. In some embodiments, software can be used to automatically determine which Mix should be selected next.
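The selection rule in the preceding paragraph can be sketched in software as follows. This is only an illustrative sketch under stated assumptions: the Mixes are taken to be ordered A (most concentrated) through F (most dilute) in 10-fold steps, and the number of steps moved is estimated from the order of magnitude of the NT/CT ratio. The names `MIXES` and `next_mix` are hypothetical.

```python
import math

# Assumed ordering: A is the most concentrated Mix, F the most dilute,
# with about a 10-fold difference between neighbouring Mixes.
MIXES = ["A", "B", "C", "D", "E", "F"]


def next_mix(current: str, nt_ct_ratio: float) -> str:
    """Suggest the Mix to use next, given the NT/CT ratio from the current Mix."""
    if nt_ct_ratio <= 0:
        raise ValueError("NT/CT ratio must be positive")
    # Within about 1:10 to 10:1 the target is considered in balance.
    if 0.1 <= nt_ct_ratio <= 10:
        return current
    # Positive steps: NT in excess, move toward a more concentrated Mix (toward A).
    # Negative steps: CT in excess, move toward a more dilute Mix (toward F).
    steps = round(math.log10(nt_ct_ratio))
    index = MIXES.index(current) - steps
    index = max(0, min(len(MIXES) - 1, index))  # stay within the series
    return MIXES[index]


print(next_mix("E", 100.0))  # a ratio of about 100 with Mix E suggests Mix C
```

Stepping by the order of magnitude of the ratio is one possible design; moving only to the adjacent Mix on each round is a simpler alternative.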

FIG. 14 further illustrates selection of Mix C. The NT/CT ratio obtained for the target nucleic acid (c-myc in this illustration) is plotted on a graph. Position on the graph can indicate which Mix should be used for nucleic acid expressed at that level. In some embodiments, described in more detail below, software automatically communicates the correct Mix to be used to a robot.

Another sample of the nucleic acid, e.g., at the same cDNA dilution, can then be combined with the subsequently-selected serially-diluted standardized mixture. After combining, the nucleic acid and its competitive template can be co-amplified, e.g., as described in detail above. Also as described above, a ratio can be obtained comparing the amount of amplified product of the nucleic acid to the amount of amplified product of its corresponding competitive template.

FIG. 15 illustrates the situation where the next Mix selected does provide competitive template for target nucleic acid sufficiently in balance with the amount of target nucleic acid in the cDNA dilution. As FIG. 15 illustrates, amplified product of c-myc NT is within about a 10-fold amount of amplified product of c-myc CT. In some embodiments, software determines area under curve for the NT and CT and calculates the ratio of NT/CT for the target nucleic acid. In some embodiments, software can also compare this ratio with the NT/CT ratio for the nucleic acid serving as a reference.

In preferred embodiments, the amount of sample cDNA can be kept constant while a different standardized mixture is used. As another example, if Mix D were used and the amount of amplified product of the NT was more than 10-fold greater than that of the corresponding CT, the experiment can be repeated with the same starting amount of cDNA, but using Mix C, which has about a 10-fold higher concentration of the competitive template, or Mix A or Mix B. Where the amount of amplified product of the NT is more than 10-fold lower than that of the corresponding CT, the experiment can be repeated with the same starting amount of cDNA, but using Mix E or Mix F. The more dilute mixture and/or the more concentrated mixture selected may be the next more dilute and/or more concentrated mixture in the series or a different serially-diluted mixture in the series, e.g., depending on the magnitude of the ratio obtained.

A highly preferred embodiment, in terms of cDNA consumption and reduced cost, involves using 1 μl of balanced cDNA in round one of a two-step process with each of the six (A-F) competitive template mixes; using 10 nanoliters of the round one amplified product in parallel 100 nanoliter volume round two amplifications to measure amounts of all of the 96 nucleic acids using Mix E (which contains competitive templates at a concentration that will be in balance with the majority of genes); and then repeating the above steps for nucleic acids that are not in balance with Mix E using the appropriate mix.

When an appropriate mix is used, the amount of target nucleic acid can be assessed, in accordance with methods described herein. FIG. 16 illustrates calculation of a “ratio of ratios” based on data obtained using an appropriate Mix.

FIG. 17 illustrates a series of electropherograms, e.g., as can be obtained in preferred embodiments where multiple nucleic acids are assessed together. Additional details regarding the practice of various steps outlined above are provided in Example II below.

As indicated above, in some embodiments, the method for assessing nucleic acids using a series of serially-diluted standardized mixtures is computer implemented. FIG. 18 schematically illustrates an overall system for assessing nucleic acids, one or more steps of which may be computer implemented in various embodiments.

At step (a), a software program can determine a desired concentration of competitive template reagents to be used. This step can comprise selecting a sample dilution and/or selecting a Mix of a series of serially-diluted mixtures for combining. For example, computer implementation may comprise instructing a robotic handler to select a first one of the serially-diluted standardized mixtures for combining, e.g., Mix E as detailed above.

At step (b), a software program can cause at least one reagent to be dispensed into one or more vessels in which the amplification reactions are to be conducted; and amplified product can be directed to a suitable device for separating, identifying and/or labeling, e.g., by flowing to a microfluidic capillary electrophoresis (CE) machine. In some embodiments, this step may comprise instructing a robotic handler to dispense a selected Mix and/or sample dilution in a vessel, co-amplifying nucleic acids and their corresponding competitive templates, and separating amplified product.

At step (c), information regarding the separated amplified products can be analyzed. For example, step (c) may comprise obtaining a relationship comparing amplified product of a nucleic acid to amplified product of its competitive template. For example, after sufficient gel electrophoresis, gels can be digitally imaged automatically, and the image automatically analyzed to assess amounts of amplified product, e.g., by automatically determining area under the curves. For example, software can determine area under the curves for the NT and CT of a given nucleic acid and calculate the ratio of NT/CT.
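The area-under-the-curve calculation mentioned for step (c) can be sketched as follows. This is a hedged illustration only: it assumes the electropherogram is available as paired lists of positions and signal values, that the NT and CT peaks fall in known, non-overlapping windows, and that any baseline has already been subtracted. All function names are hypothetical.

```python
def trapezoid_area(xs, ys):
    """Integrate ys over xs with the trapezoidal rule."""
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
        for i in range(len(xs) - 1)
    )


def peak_area(positions, signal, start, end):
    """Area under the trace between start and end (inclusive)."""
    points = [(p, s) for p, s in zip(positions, signal) if start <= p <= end]
    xs = [p for p, _ in points]
    ys = [s for _, s in points]
    return trapezoid_area(xs, ys)


def nt_ct_ratio(positions, signal, nt_window, ct_window):
    """Ratio of the NT peak area to the CT peak area for one nucleic acid."""
    nt_area = peak_area(positions, signal, *nt_window)
    ct_area = peak_area(positions, signal, *ct_window)
    if ct_area == 0:
        raise ValueError("no CT product detected; ratio is undefined")
    return nt_area / ct_area


# Toy trace: the shorter CT product appears first, the NT product later.
positions = [0, 1, 2, 3, 4, 5, 6]
signal = [0, 2, 0, 0, 0, 4, 0]
print(nt_ct_ratio(positions, signal, nt_window=(4, 6), ct_window=(0, 2)))
```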

In some embodiments, calculation steps are incorporated into a spreadsheet. For example, in some embodiments, a user can enter raw values (e.g., for peak heights or area under the curve) for the NT, CT, and heterodimer PCR products for a given gene to be measured into a spreadsheet, and the expression value for the gene can be automatically calculated. In some embodiments, software can be used to automatically enter values for NT and CT amplified product for each of one or more nucleic acids to be measured into a spreadsheet to automatically calculate a numerical value, e.g., a numerical value corresponding to gene expression.

Information from step (c) can be provided in a “Report”, sent to a “Database” and/or sent to step (d), which can reiterate the process for further analysis of data received. For example, if the calculated ratio is not within a desired range (for example, within about a 1:10 to about a 10:1 ratio) as described above, a new desired concentration of competitive template reagents (i.e., different from the original concentrations selected in step (a)) may be chosen and steps (b)-(c) repeated. In some embodiments, software can be used to automatically determine which Mix should be selected next, based on considerations described above. In some embodiments, a software program can instruct a robotic handler to combine a sample with the new Mix.

Another aspect of the present invention is directed to a computer program for implementing certain embodiments of methods of the instant invention. In certain embodiments, the computer program includes a computer readable medium and instructions stored on the computer readable medium. In preferred embodiments, the instructions include one or more steps recited above. The computer program can further include instructions for dispensing amplified product into arrays for measurement, as well as instructions for fluorescently labeling amplified product and/or nucleic acid to which they hybridize. Amplified product may be labeled, e.g., by labeling one or more nucleotides in the amplification reaction with a detectable moiety, e.g., a fluorescent moiety. The computer program can further include instructions for measuring amounts of nucleic acid, e.g., by comparing fluorescent intensities of the arrays for the amplified product of a given nucleic acid and its competitive template.

D. Substantially Constant Relationship

Some embodiments of the present invention described above provide a relationship for assessing nucleic acid where the relationship remains constant or substantially constant beyond the exponential phase of amplification. In nucleic acid amplifications, e.g., PCR, the amount of amplified product can cease to increase exponentially after an indefinite number of cycles. For example, at some point and for uncertain reasons, the amplification reaction can become limited and the amount of amplified product can increase at an unknown and/or non-exponential rate. For example, PCR amplification rate can be low in early cycles when the concentration of the templates is low. After an unpredictable number of cycles, the reaction can enter a log-linear amplification phase. In late cycles, the rate of amplification can slow as the concentration of PCR products becomes higher, e.g., high enough to compete with primers for binding to templates. The yield of amplified product in PCR reactions, for example, has been reported to vary by as much as 6-fold between identical samples run simultaneously (Gilliland, G., et al., Proc. Natl. Acad. Sci. 87:2725-2729, 1990). PCR techniques are generally described in U.S. Pat. Nos. 4,683,195; 4,683,202; and 4,965,188. Other investigators have analyzed samples amplified for a number of cycles known to provide exponential amplification (Horikoshi, T., et al., Cancer Res. 52:108-116 (1992); Noonan, K. E., et al., Proc. Natl. Acad. Sci. 87:7160-7164 (1990); Murphy, L. D., et al., Biochemistry 29:10351-10356 (1990); Carre, P. C., et al., J. Clin. Invest. 88:1802-1810 (1991); Chelly, J., et al., Eur. J. Biochem. 187:691-698 (1990); Abbs, S., et al., J. Med. Genet. 29:191-196 (1992); Feldman, A. M., et al., Circulation 83:1866-1872 (1991)). Some embodiments of the instant invention allow quantification of PCR amplification at any phase in the PCR process, including the plateau phase.

Some embodiments of the present invention relate to obtaining a relationship constant (or substantially constant) beyond the exponential phase of nucleic acid amplification, thereby allowing the initial amount of a nucleic acid to be determined by extrapolation from end point amounts of amplified product. In some embodiments, the exponential phase for amplifying the nucleic acid need not be defined for each set of experimental conditions, saving time and materials. For example, some embodiments do not involve real-time measurements. Some embodiments do not involve generation of a standard curve, and/or generation of multiple standard curves, e.g., where the standard curve is used to determine an exponential range of amplification for a given nucleic acid to be measured and/or where the standard curves compare measured amounts of one nucleic acid to another.

FIGS. 19 and 20 illustrate how certain embodiments of the present invention can provide a constant or substantially constant relationship using end-point measurements beyond the exponential phase of amplification. Data shown in the graphs were obtained as detailed in Example III below. FIG. 19 illustrates that the amount of amplified product vs. total starting amount of RNA does not remain linear with increasing amounts of RNA, e.g., beyond the exponential phase of amplification. That is, there is a non-linear relationship between amount of amplified product (empty boxes: glutathione peroxidase GSH-Px; solid boxes: glyceraldehyde-3-phosphate dehydrogenase GAPDH) and total starting amount of RNA for increasing amounts of RNA, e.g., beyond the exponential phase of amplification. Straight lines represent theoretical amounts of PCR product (either GSH-Px or GAPDH) that would be obtained if amplification remained exponential throughout the amplification process.

FIG. 20 illustrates that a linear relationship can be obtained where the ratio of (amplified product of nucleic acid/co-amplified product of its competitive template) is plotted against total starting amount of RNA for first and second nucleic acids corresponding to GSH-Px (empty boxes) and GAPDH (solid boxes), respectively. FIG. 20 illustrates two linear (or substantially linear) relationships for two different nucleic acids, each co-amplified with its respective competitive template, with r²=0.982 for GSH-Px and r²=0.973 for GAPDH for the range of total RNA studied.

FIG. 21 illustrates that the relationship of (amplified product of first nucleic acid/co-amplified product of its competitive template)/(amplified product of second nucleic acid/co-amplified product of its competitive template) to total starting amount of RNA remains constant, or substantially constant, for the two different nucleic acids when amplified in accordance with various embodiments of the instant invention. Accordingly, some embodiments of the instant invention use a relationship that compares at least two relationships for at least two nucleic acids in a sample, namely, a first relationship comparing amplified product of a first nucleic acid to co-amplified product of a competitive template for the first nucleic acid, and a second relationship comparing amplified product of a second nucleic acid to co-amplified product of a competitive template for the second nucleic acid. Additional details of some of these co-amplifications are provided above. In some embodiments, the relationship sought further compares amplified product of a number of other nucleic acid(s) to co-amplified product of competitive template(s) for said number of other nucleic acid(s).
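The comparison of two NT/CT relationships described above can be sketched numerically. The sketch below is illustrative and rests on assumptions not stated in this passage: each NT/CT ratio is converted to a molecule count by multiplying by the known number of competitive template molecules added, and the result is reported per 10⁶ molecules of the reference nucleic acid, matching the units used elsewhere in this description. Function and parameter names are hypothetical.

```python
def molecules_from_ratio(nt_signal: float, ct_signal: float, ct_molecules: float) -> float:
    """Estimate NT molecules from the NT/CT product ratio and the known CT input."""
    if ct_signal <= 0:
        raise ValueError("CT product must be detectable")
    return (nt_signal / ct_signal) * ct_molecules


def expression_per_million_reference(
    target_nt: float,
    target_ct: float,
    target_ct_molecules: float,
    reference_nt: float,
    reference_ct: float,
    reference_ct_molecules: float,
) -> float:
    """Ratio of ratios: target molecules per 10^6 reference molecules."""
    target = molecules_from_ratio(target_nt, target_ct, target_ct_molecules)
    reference = molecules_from_ratio(reference_nt, reference_ct, reference_ct_molecules)
    return target / reference * 1e6


# Toy values: target NT/CT = 2 against 6,000 CT molecules;
# reference NT/CT = 1 against 600,000 CT molecules.
value = expression_per_million_reference(200.0, 100.0, 6_000, 150.0, 150.0, 600_000)
print(value)  # about 20,000 target molecules per 10^6 reference molecules
```

Because both the nucleic acid and its competitive template are measured in the same reaction, factors that scale both products equally cancel in each ratio.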

E. Sensitivity

Some embodiments of the present invention described above provide a relationship for assessing nucleic acid with sensitivity. Sensitivity can be defined as the ability of a procedure to produce a change in signal for a defined change in the quantity of analyte, i.e., the slope of a calibration curve. Some embodiments of the instant invention provide a slope greater than about 0.1, greater than about 0.2, greater than about 0.5, or greater than about 0.8. Some preferred embodiments of the instant invention provide a slope of about 1/1.

For example, some embodiments of the instant invention provide a relationship capable of detecting less than about a two-fold difference, less than about a one-fold difference, less than about an 80% difference, less than about a 50% difference, less than about a 30% difference, less than about a 20% difference, less than about a 10% difference, less than about a 5% difference, or less than about a 1% difference. Such sensitivities can correspond to identifying small changes in gene expression.

In some embodiments, one or more of these differences can be detected in about 1,000 molecules or less of the nucleic acid in the sample, e.g., in about 800, in about 600, or in about 400 molecules. In some embodiments, one or more of these differences can be detected in about 100 molecules or less (e.g., in about 60 molecules), in about 10 molecules or less (e.g., in about 6 molecules), or in about 1 molecule or less of the nucleic acid in a sample. In some embodiments, one or more of these differences can be detected in less than about 10,000,000, less than about 5,000,000, less than about 1,000,000, less than about 500,000, less than about 100,000, less than about 50,000, less than about 10,000, less than about 8,000, less than about 6,000, less than about 5,000, or less than about 4,000 molecules of the nucleic acid in a sample.

Some embodiments, as described above, assess nucleic acids over a range of concentrations, e.g., assessing gene expression over one or more orders of magnitude of gene expression. In some such embodiments, assessing detects less than about a two-fold difference over the range. In some embodiments, assessing detects less than about a one-fold difference, less than about an 80% difference, less than about a 50% difference, less than about a 30% difference, less than about a 20% difference, less than about a 10% difference, less than about a 5% difference, or less than about a 1% difference over the range.

Sensitivities described herein can be achieved by some of the embodiments of the instant invention.

F. Reproducibility

In preferred embodiments, methods of assessing a nucleic acid are reproducible. Some embodiments, for example, provide a coefficient of variation of less than about 25% between samples of a nucleic acid. In some embodiments, the coefficient of variation is less than about 50%, less than about 30%, less than about 25%, less than about 20%, less than about 15%, less than about 10%, less than about 5%, or less than about 1% between 2 or more samples of the nucleic acid. Such coefficients of variation can be obtained in some embodiments where the 2 samples are amplified and/or assessed at different times, e.g., on different days; in the same or different experiments in the same laboratory; and/or in different experiments in different laboratories; and/or where the samples are obtained from different subjects and/or different species. Preferred embodiments of the present invention provide both intra- and inter-laboratory reproducibility (M. T. Vondracek, D. A. Weaver, Z. Sarang et al., Int. J. Cancer 99, 776-782 (2002)) that is sufficient to detect less than two-fold differences in gene expression. For example, in some embodiments, inter-laboratory correlation of variance was 0.48, e.g., from gene expression measurements using an A549 cDNA sample taken in different laboratories at different times, spanning nearly one year. In some embodiments, e.g., embodiments using micro-channel capillary electrophoresis, the correlation of variance was reduced to 0.26. Additional details of a study to evaluate reproducibility are provided in Example IV below.
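For reference, the coefficient of variation (CV) used above as the measure of reproducibility is the standard deviation of replicate measurements divided by their mean. A minimal sketch follows, assuming the sample (n-1) standard deviation is intended; the source does not state which estimator was used, and the example values are made up for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV (sample standard deviation / mean) of replicate measurements."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical replicate expression values (molecules per 10^6 reference molecules)
replicates = [90, 100, 110]
print(coefficient_of_variation(replicates))  # CV of about 0.1, i.e., about 10%
```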

In some embodiments, reproducibility between samples allows for the use of fewer dilution tubes. In some embodiments, a single tube may be used, simplifying procedures and permitting the evaluation of many different samples at one time.

In some embodiments, including competitive template internal standards in a common standardized mixture used in different measurements can control for one or more sources of variation. Sources of variation include, e.g., variation from cDNA loading, intra-nucleic acid amplification efficiency, inter-nucleic acid amplification efficiency, inter-specimen amplification efficiency, inter-sample amplification efficiency, and/or intra-sample amplification efficiency. For example, some embodiments using an Agilent 2100 Bioanalyzer routinely provide an intra-laboratory CV of less than about 0.25, with a sensitivity comparable to slab gel electrophoresis.

FIG. 22 tabulates a number of sources of variation and control methods.

Variation in cDNA loading may result from variation in pipetting, aliquoting, quantification, and/or reverse transcription. For example, errors may occur when aliquoting RNA material into vessels for performing reverse transcription. Although reverse transcription efficiency can vary from one sample to another, the representation of one nucleic acid to another in a sample need not vary among different reverse transcriptions.

For example, the efficiency of reverse transcription can vary from about 5 to about 90% (Simmonds et al, 1990). Variation in reverse transcription efficiency, however, may affect different transcripts in the same or substantially the same manner (Willey et al, 1998; Loitsch et al, 1999). In one experiment, for example, gene expression was measured in 5 different reverse transcriptions of a given sample of RNA from the SW900 non-small cell carcinoma cell line. The mean level of expression obtained was 3,600 molecules/10⁶ β-actin molecules with a CV of 0.26, no greater than if replicate measurements had been made on cDNA resulting from a single reverse transcription. However, if reverse transcription and amplification reactions are carried out in different vessels, errors may occur when pipetting cDNA from the reverse transcription reaction into individual PCR reaction vessels. That is, without being limited to a particular theory and/or hypothesis, the effect of variation in reverse transcription can be the same as if different levels of cDNA were loaded in a PCR reaction. Controlling for cDNA loading can then control variation in reverse transcription efficiency.

Variation in intra-nucleic acid amplification efficiency may result from, e.g., cycle-to-cycle variation, e.g., where different amplification cycles show various early slow, log-linear and/or late slow plateau phases, as described above. Where gene expression is being measured, intra-nucleic acid amplification efficiency can refer to intra-gene amplification efficiency, i.e., for example, variation in repeat amplifications of cDNA corresponding to a given gene.

Variation in inter-nucleic acid amplification efficiency can refer to inter-gene amplification efficiency, e.g., where the efficiency at which a given gene is amplified differs from that at which a different gene is amplified. Such differences may be caused by, e.g., differences in the primers used for amplifying the different genes measured in the same and/or different samples. For example, the efficiency of a pair of primers, e.g., as defined by lower detection threshold (LDT), may not be predictable, and may vary more than about 100,000-fold (from less than about 10 molecules to about 10⁶ molecules) in some embodiments.

Also, a bad lot (e.g., where degradation of primers and/or competitive templates has occurred) or inappropriate concentration of primers would cause variation in PCR amplification of one nucleic acid relative to another. In some embodiments, the concentration of competitive template is small (e.g., femtomolar range) so that any change in the number of molecules present in the reaction may introduce a large source of error. Presence of an inhibitor could alter PCR amplification efficiency of one nucleic acid, e.g., one gene, compared to another.

Variation in inter-specimen amplification efficiency may be caused by, e.g., variable presence of an inhibitor (e.g., an inhibitor of PCR) in different specimens. PCR reaction inhibitors include, e.g., heme. Akane, A., Matsuara, K., Nakamura, H., Takahashi, S., and Kimura, K. (1994) Identification of the heme compound co-purified with deoxyribonucleic acid (DNA) from blood stains, a major inhibitor of polymerase chain reaction (PCR) amplification. J. Forensic Sci. 39, 362-372; Zhu, Y. H., Lee, H. C., and Zhang, L. (2002) An examination of heme action in gene expression: Heme and heme deficiency affect the expression of diverse genes in erythroid K562 and neuronal PC12 cells. DNA Cell Biol. 21, 333-346. Further, amplification efficiency for different genes may be affected to different degrees in different samples and/or specimens. Meijerink, J., Mandigers, C., van de Locht, L., et al. (2001) A novel method to compensate for different amplification efficiencies between patient DNA samples in quantitative real-time PCR. J. Mol. Diagn. 3, 55-61; Giulietti, A., Overbergh, L., Valckx, D., et al. (2001) An overview of real-time quantitative PCR: applications to quantify cytokine gene expression. Methods 25, 386-401. Such differences may result in variation in measuring the same or different nucleic acids (e.g., the same or different genes) in the same or different specimens and/or samples. For example, a given PCR inhibitor may have little effect on amplification of a lowly expressed gene, e.g., GSTM3. The same PCR inhibitor may have a larger effect, e.g., a significantly larger effect, on amplification of a more highly expressed gene, e.g., ERBB2, including, e.g., preventing amplification or reducing amplification to non-detectable levels.

Variation in inter-sample amplification can refer to inter-reaction variation or well-to-well variation in repeat measurements of the same or different nucleic acids (e.g., the same or different genes) in the same or different samples and/or specimens. Variation in inter-sample amplification efficiency can result from, for example, variable presence of an inhibitor (e.g., an inhibitor of PCR) in different reaction vessels, variation in temperature cycling between different regions of a thermocycler block, variable quality of one or more PCR reagents, or variable concentrations of one or more PCR reagents (e.g., primers).

One or more of these sources of variation can reduce PCR amplification efficiency in a well to the point where no PCR product can be observed in that well. Some embodiments of the instant invention allow this type of error to be recognized, for example, embodiments using a standardized mixture comprising about 10⁻¹⁷ M competitive template for the nucleic acid sought to be amplified. In a 10 μL PCR reaction volume, about 10⁻¹⁷ M represents about 60 molecules. With about 60 molecules of internal standard present in the PCR reaction and components of the PCR reaction functioning properly, if a nucleic acid is not present in a sample, the amplified product for the competitive template will be observed, but the amplified product for the nucleic acid will not. This may indicate that there were fewer than about six molecules (about 10-fold less than the number of competitive template molecules) of nucleic acid in the sample. On the other hand, if amplified product of neither the nucleic acid nor its competitive template is detectable, it can be determined that the PCR reaction efficiency was suboptimal.
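The error-recognition logic described above can be summarized as a small decision function. This is a hedged sketch, not the described software: the status labels and the function name are hypothetical, and the branch for NT product detected without CT product is an added assumption, since the passage above discusses only the other cases.

```python
def interpret_well(nt_detected: bool, ct_detected: bool) -> str:
    """Classify a PCR well from the presence of NT and CT amplified product."""
    if not nt_detected and not ct_detected:
        # Even the internal standard failed to amplify.
        return "suboptimal_pcr"
    if ct_detected and not nt_detected:
        # Reaction worked; target is below about 1/10 of the CT molecules present.
        return "target_below_detection"
    if nt_detected and not ct_detected:
        # Assumption: target greatly exceeds the CT; a more concentrated Mix may be needed.
        return "ct_out_of_balance"
    return "quantifiable"


def detection_floor(ct_molecules: float) -> float:
    """Approximate lower bound on measurable target molecules (about 10-fold below CT)."""
    return ct_molecules / 10


print(interpret_well(nt_detected=False, ct_detected=True))
print(detection_floor(60))  # about 6 molecules for about 60 CT molecules
```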

Variation in intra-sample amplification can refer to intra-reaction variation, e.g., variable amplification efficiency in a given reaction using a given sample. Variation in intra-sample amplification efficiency may result from, e.g., variation in thermocycler efficiency at various positions within a thermocycler, and can introduce variation when measuring amounts of the same or different nucleic acids (e.g., expression of the same or different genes) in the same or different samples and/or specimens.

Some embodiments for measuring nucleic acids control for variation caused by one or more sources of variation selected from cDNA loading, intra-nucleic acid amplification efficiency, inter-nucleic acid amplification efficiency, inter-specimen amplification efficiency, inter-sample amplification efficiency, and intra-sample amplification efficiency. For example, in some embodiments, use of a standardized mixture and/or a series of serially-diluted standardized mixtures can provide control.

Some preferred embodiments control for one or more sources of variation without the use of real-time measurements obtained using kinetic analysis (e.g., real-time PCR measurements). For example, obtaining a “ratio of ratios” in some embodiments does not involve taking real-time measurements. Some preferred embodiments control for one or more sources of variation without generating one or more standard curve(s). For example, obtaining a “ratio of ratios” in some embodiments does not involve generating a standard curve. In more preferred embodiments, one or more sources of error are controlled for using methods that involve neither real-time measurements nor generation of a standard curve. In even more preferred embodiments, two or more, three or more, four or more, five or more, or six sources of variation are controlled for without real-time measurements or generation of a standard curve.

FIG. 23 illustrates the control of one or more of these sources of error in some embodiments compared to real-time RT-PCR in two different specimens in four different experiments. In FIG. 23, the nucleic acids being measured are referred to as native template (NT), the competitive template for each is referred to as CT, and the second nucleic acid serves as the reference nucleic acid.

FIG. 23 illustrates amplified product of native template and competitive template for a first and a second nucleic acid that are PCR-amplified simultaneously for the indicated number of cycles. The amplified products at endpoint are electrophoretically separated, e.g., in the presence of fluorescent intercalating dye, and quantified densitometrically. In the illustrated embodiment, the shorter CT PCR product migrates faster than the NT PCR product, and is represented by a CT band below the NT band. As one of skill in the art will understand, if there is more NT product than CT product, the NT band will emit more fluorescent light; if there is more CT product than NT product, the CT band will emit more fluorescent light. In real-time measurement, the fluorescent PCR product is measured at each of the 35 to 40 cycles. FIG. 23 illustrates how the reactions would look if measured at each cycle in real time, and the C_(T) for the real-time curve is represented by the perpendicular black line.

FIG. 23 a illustrates that the ratio of NT/CT present at the beginning of PCR remains (substantially) constant throughout PCR to endpoint. As described above, it is not necessary to monitor the amplification reaction in real-time to ensure that the reaction is in log-linear phase in some embodiments of the instant invention.

FIG. 23 a illustrates an experiment using a first sample of a first specimen. In the first sample, there is an about equivalent number of molecules of the second nucleic acid NT and CT present at the beginning of the PCR reaction (e.g., as described above, where a balanced cDNA dilution is used). Thus, following electrophoresis of the amplified product of the second nucleic acid, the NT and CT bands are about equivalent, and during real-time measurement, the fluorescent intensity for the NT will be about the same as for the CT. The NT/CT ratio is the same at an early cycle as it is at a late cycle (endpoint), even though the band intensity for both NT and CT is low at early cycle compared to late cycle. Similarly, the first nucleic acid NT band and CT band are about equivalent, and the real-time value for the NT is about the same as for the CT. The ΔC_(T) between the second and the first nucleic acid in real-time measurements is about 10.

FIG. 23 b further illustrates controls for loading from one sample to another. In FIG. 23 b, the first specimen is re-analyzed using a lower starting amount of nucleic acid, e.g., less cDNA loaded, due to a variation in pipetting, e.g., in aliquoting a second sample of the first specimen into a different vessel. The NT/CT ratio for the second nucleic acid is lower. However, because the relative concentration of competitive templates is fixed and the relative representation of each nucleic acid is fixed, the NT/CT ratio for the first nucleic acid goes down commensurately. Accordingly, the “ratio of ratios” (odds ratio) of the first nucleic acid NT/CT divided by the second nucleic acid NT/CT remains the same as in FIG. 23 a. In this case, the ΔC_(T) in real-time analysis is also unchanged.

FIG. 23 c illustrates controls for loading and variation in amplification efficiency. In FIG. 23 c, the first specimen is again re-analyzed, but with both (1) a larger amount of cDNA loaded due to variation in pipetting (leading to variation in starting amount of native template) and (2) lowered amplification efficiency of the second nucleic acid, as might be caused by an inhibitor in the well that affects amplification of this nucleic acid more than the other, or inappropriate concentrations of primers for the second nucleic acid.

FIG. 23 c illustrates that with real-time measurements, this reduces the ΔC_(T) from 10 to 6, and the value for the first nucleic acid is inappropriately high. In real-time measurements, the gene-selective inhibition is associated with a decreased ΔC_(T) and erroneous measurement.

In contrast, using certain embodiments described herein, because the amplification efficiency of the NT for each of the two nucleic acids is affected the same way as that of its corresponding CT, the NT/CT ratio for either the first or the second nucleic acid is not altered by the change in amplification efficiency between FIGS. 23 a and 23 c. Also, with the larger amount of cDNA loaded, the first nucleic acid NT/CT ratio and the second nucleic acid NT/CT ratio increase commensurately. Accordingly, the “ratio of ratios” (odds ratio) of the first nucleic acid NT/CT divided by the second nucleic acid NT/CT stays the same between FIGS. 23 a and 23 c.
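The behavior described for FIGS. 23 a through 23 c can be illustrated with a minimal numerical sketch. It assumes an idealized exponential amplification model, copies = N0 × (1 + E)^cycles, with each NT sharing the efficiency E of its own CT. The starting copy numbers, efficiencies, and detection threshold are illustrative assumptions and are not taken from FIG. 23, so the ΔC_(T) values produced are not expected to reproduce the figure exactly.

```python
import math

def endpoint(n0, efficiency, cycles):
    """Idealized exponential PCR: copies present after `cycles` cycles."""
    return n0 * (1.0 + efficiency) ** cycles

def ratio_of_ratios(nt1, ct1, nt2, ct2, eff1, eff2, cycles=35):
    """(NT/CT of first nucleic acid) / (NT/CT of second), at endpoint."""
    r1 = endpoint(nt1, eff1, cycles) / endpoint(ct1, eff1, cycles)
    r2 = endpoint(nt2, eff2, cycles) / endpoint(ct2, eff2, cycles)
    return r1 / r2

def delta_ct(nt1, nt2, eff1, eff2, threshold=1e10):
    """Real-time style difference in threshold cycle (first minus second)."""
    ct_first = math.log(threshold / nt1) / math.log(1.0 + eff1)
    ct_second = math.log(threshold / nt2) / math.log(1.0 + eff2)
    return ct_first - ct_second

# Assumed CT copies fixed by the standardized mixture.
CT1, CT2 = 600.0, 600000.0

# (loading factor, efficiency of first, efficiency of second)
scenarios = {
    "a: baseline": (1.0, 1.0, 1.0),
    "b: less cDNA loaded": (0.5, 1.0, 1.0),
    "c: more cDNA, second inhibited": (2.0, 1.0, 0.8),
}

for name, (load, e1, e2) in scenarios.items():
    nt1, nt2 = 600.0 * load, 600000.0 * load  # loading scales both NTs
    rr = ratio_of_ratios(nt1, CT1, nt2, CT2, e1, e2)
    dct = delta_ct(nt1, nt2, e1, e2)
    print(f"{name}: ratio of ratios = {rr:.3f}, delta C_T = {dct:.2f}")
```

Under these assumptions the ratio of ratios is the same in all three scenarios, whereas the simulated ΔC_(T) shifts only when the efficiency of the second nucleic acid is lowered.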

FIG. 23 d further illustrates controls for loading a sample of a second specimen, where the first nucleic acid is more highly expressed. Although the first nucleic acid is expressed at a higher level compared to the second nucleic acid, real-time measurements give a ΔC_(T) of about 7.

In contrast, using certain embodiments described herein, the ratio of ratios indicates the higher level of expression. As less cDNA is loaded into the PCR reaction, there are fewer copies of the second nucleic acid NT than CT copies present at the beginning of the PCR reaction compared with FIG. 23 a. Throughout real-time measurement, the fluorescence value of the NT is less than that of the CT, and at the end of PCR, the second nucleic acid NT band is still less dense than the CT band. However, even though less cDNA was loaded into the PCR reaction compared to the first sample, the first nucleic acid NT band is more dense than the first nucleic acid CT band due to its higher expression, and the first nucleic acid NT fluorescence value during real-time measurement is higher throughout PCR. Accordingly, the “ratio of ratios” (odds ratio) of the first nucleic acid NT/CT divided by the second nucleic acid NT/CT provides a higher value in FIG. 23 d than in FIG. 23 a.

Thus, real-time RT-PCR may control for loading by measuring the first and second nucleic acids in the same PCR reaction (FIGS. 23 a, 23 b, 23 d). The C_(T) (for each nucleic acid represented by a black line intersecting the X axis) for the first and second nucleic acids both could vary from one experiment to another, but the ΔC_(T) does not vary. However, real-time RT-PCR does not control for variation in the presence of inhibitors, or the quality of PCR reagents.

II. Methods of Preparing Compositions for Assessing Nucleic Acid

Another aspect of the instant invention relates to methods for preparing compositions for assessing a nucleic acid in a sample.

A. Preparation of Standardized Mixtures

Some embodiments of the invention provide a method for preparing a standardized mixture of reagents. As used herein, “reagent” can refer to a component used in a mixture, including solvent and/or solute. For example, reagents include nucleic acids and/or water, e.g., in the case of aqueous mixtures. In some embodiments, the standardized mixture of reagents comprises sufficient amounts of competitive template for assessing amounts of a number of nucleic acids in a number of samples, e.g., more than about 10⁶ samples. In preferred embodiments, the standardized mixture allows direct comparison of the amounts between at least 2 of the samples. More preferred embodiments allow direct comparison of amounts assessed in at least about 5 samples, at least about 10 samples, at least about 50 samples, at least about 100 samples, at least about 500 samples, at least about 1,000 samples, at least about 5,000 samples, at least about 10,000 samples, at least about 50,000 samples, at least about 100,000 samples, at least about 500,000 samples, at least about 1,000,000 samples, at least about 5,000,000 samples, or at least about 10,000,000 samples. In some specific embodiments, the standardized mixture allows direct comparison of amounts assessed in up to an unlimited number of samples.

In some embodiments, the standardized mixture comprises sufficient reagents for assessing amounts of one nucleic acid. In some embodiments, the standardized mixture comprises sufficient reagents for assessing amounts of more than one nucleic acid, e.g., at least about 50, at least about 96, at least about 100, at least about 200, at least about 300, at least about 500, at least about 800, at least about 1,000, at least about 5,000, at least about 10,000, at least about 50,000, or at least about 100,000 nucleic acids. In some embodiments, the standardized mixture comprises sufficient reagents for assessing amounts of less than about 100,000, less than about 500,000, or less than about 1,000,000 nucleic acids. In preferred embodiments, different nucleic acids correspond to different gene transcripts.

In some embodiments, the reagents include at least one forward primer and/or a reverse primer capable of priming amplification of a competitive template in the mixture. In some embodiments, at least one competitive template, forward primer and/or reverse primer comprises a sequence referenced in Table 4, illustrated in FIG. 4.

In some embodiments, a forward primer and/or a reverse primer are designed to have substantially the same annealing temperature as another forward primer and/or reverse primer in the standardized mixture. Designing primers with the same or substantially the same annealing temperature can allow amplification reactions to achieve approximately the same amplification efficiency under identical or substantially identical conditions. In such embodiments, if there is variation in amplification efficiency, amplification efficiency of a nucleic acid and its competitive template can be affected identically (or substantially identically), so that the ratio of amplified product of the nucleic acid and its corresponding competitive template may not vary or may not substantially vary. In some specific embodiments, a forward and reverse primer have the same or substantially the same annealing temperature as each of the other forward and reverse primers in a given standardized mixture. For example, the annealing temperature may be about 40° C., about 44° C., about 50° C., about 55° C., about 57° C., about 58° C., about 59° C., about 60° C., about 65° C., about 70° C., about 75° C., or about 85° C.

In some embodiments, an internal standard competitive template can be prepared for a number of nucleic acids to be evaluated, including nucleic acids that can serve as one or more reference nucleic acids. The competitive templates can then be cloned to generate enough to assess amounts of a nucleic acid in more than about 10⁴ samples, in more than about 10⁵ samples, in more than about 10⁶ samples, in more than about 10⁷ samples, in more than about 10⁸ samples, in more than about 10⁹ samples, in more than about 10¹⁰ samples, in more than about 10¹¹ samples, in more than about 10¹² samples, in more than about 10¹³ samples, in more than about 10¹⁴ samples, or in more than about 10¹⁵ samples.

The competitive templates can be carefully quantified and then mixed together to form a standardized mixture. In some embodiments, the forward primer and/or reverse primer can be selected to allow for detection of about 10⁻¹⁰, about 10⁻¹¹, about 10⁻¹², about 10⁻¹³, about 10⁻¹⁴, about 10⁻¹⁵, about 10⁻¹⁶, about 10⁻¹⁷, about 10⁻¹⁸ M or less of the nucleic acid to be measured. For example, the forward and/or reverse primer can allow for the detection of about 600 molecules, about 60 molecules or about 6 molecules of the nucleic acid in some embodiments.

In some embodiments, a standardized mixture of the instant invention can measure and/or enumerate less than about 1,000 molecules of nucleic acid in a sample, e.g., about 800, about 600, or about 400 molecules. In some embodiments, less than about 100 molecules (e.g., about 60 molecules), preferably less than about 10 molecules (e.g., about 6 molecules), or more preferably less than about 1 molecule of a nucleic acid can be measured and/or enumerated in a sample. In some embodiments, a standardized mixture of the instant invention can measure and/or enumerate less than about 10,000,000, less than about 5,000,000, less than about 1,000,000, less than about 500,000, less than about 100,000, less than about 50,000, less than about 10,000, less than about 8,000, less than about 6,000, less than about 5,000, or less than about 4,000 molecules of a nucleic acid in a sample.

In some embodiments, the reagents for measuring amounts of nucleic acids are stable. For example, the primers and/or competitive templates of a standardized mixture may comprise stable nucleic acid molecules, such as DNA. Reagents may be stable for at least about 20 years, at least about 50 years, at least about 100 years, at least about 500 years, or at least about 1,000 years. In preferred embodiments, a standardized mixture of the present invention can provide reagents to measure sufficient nucleic acids corresponding to gene expression measurements expected to be made for at least about 20 years, at least about 50 years, at least about 100 years, at least about 500 years, or at least about 1,000 years, e.g., at the current rate of gene expression measurement (estimated to be about one billion assays a year; see “An economic forecast for the gene expression market,” http://www.researchandmarkets.com/reports/5545).

In some embodiments, long-term storage of reagents and/or samples comprising DNA can be achieved at −20 degrees C. In some embodiments, reagents and/or samples comprising RNA are stable for years frozen as an EtOH precipitate and/or in RNase-free water. In some embodiments, competitive templates are stably frozen for more than six years. In some embodiments, cDNA samples are stable for more than two years frozen at −20 degrees C.

A standardized mixture according to some embodiments of the present invention can be prepared to perform one or more of the methods described herein. For example, as described above, using a standardized mixture, a nucleic acid can be assessed relative to one or more other nucleic acids (e.g., that can serve as controls for cDNA loaded into the reaction). Also as detailed above, a nucleic acid can be assessed relative to its respective competitive template provided in the standardized mixture.

In some embodiments, the standardized mixture can allow for detection with one or more of the sensitivities, one or more of the accuracies, one or more of the detection limits, and/or with one or more of the coefficients of variation taught herein. Additional features of the prepared standardized mixture will be apparent to one of skill in the art, based on the disclosure herein.

B. Preparation of Series of Serially-Diluted Standardized Mixtures

Some embodiments of the invention provide a method for preparing a series of serially-diluted standardized mixtures. In some embodiments, one or more of the series of standardized mixtures comprises sufficient amounts of competitive templates for assessing amounts of a number of nucleic acids in a number of samples, e.g., more than about 10⁶ samples. In preferred embodiments, the standardized mixture allows direct comparison of the amounts between at least 2 of the samples. More preferred embodiments allow direct comparison of amounts assessed in at least about 5 samples, at least about 10 samples, at least about 50 samples, at least about 100 samples, at least about 500 samples, at least about 1,000 samples, at least about 5,000 samples, at least about 10,000 samples, at least about 50,000 samples, at least about 100,000 samples, at least about 500,000 samples, at least about 1,000,000 samples, at least about 5,000,000 samples, or at least about 10,000,000 samples. In some specific embodiments, the standardized mixture allows direct comparison of amounts assessed in up to an unlimited number of samples.

The series of serially-diluted standardized mixtures may be obtained by serially diluting a standardized mixture, e.g., a standardized mixture described above. For example, in some embodiments, one or more of the series may contain sufficient reagents for assessing various numbers of nucleic acids and/or for assessing various numbers of samples, e.g., as detailed above. Similarly, in some embodiments, one or more of the series of serially-diluted standardized mixtures can comprise any of the reagents of some embodiments of the standardized mixtures described above.

In preferred embodiments, a standardized mixture is diluted so that the competitive template for a first nucleic acid is at a series of concentrations relative to the competitive template for a second nucleic acid. In some embodiments, a standardized mixture is serially diluted 10-fold, providing 10-fold serial dilutions of the competitive template for the first nucleic acid relative to the competitive template for the second nucleic acid. In some embodiments, at least two of the series of concentrations span about one order of magnitude, about 2 orders of magnitude, about 3 orders of magnitude, about 4 orders of magnitude, about 5 orders of magnitude, about 6 orders of magnitude, about 7 orders of magnitude, or more. In some embodiments, the series of concentrations includes at least two, at least 3, at least 4, at least 5, or six concentrations selected from about 10⁻¹⁰ M, about 10⁻¹¹ M, about 10⁻¹² M, about 10⁻¹³ M, about 10⁻¹⁴ M, about 10⁻¹⁵ M, and about 10⁻¹⁶ M.
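A 10-fold dilution series of the kind described can be sketched as follows. The helper names and the default of seven steps starting at 10⁻¹⁰ M are illustrative assumptions chosen to match the concentrations listed above; they do not represent a prescribed protocol.

```python
import math

def tenfold_series(start_exponent=-10, steps=7):
    """Molar concentrations of successive 10-fold dilutions.

    Exponents are kept as integers so each value is computed directly
    rather than by repeated division, which avoids accumulated rounding.
    """
    return [10.0 ** (start_exponent - i) for i in range(steps)]

def span_in_orders_of_magnitude(series):
    """Orders of magnitude between the highest and lowest concentration."""
    return math.log10(max(series) / min(series))

series = tenfold_series()
for conc in series:
    print(f"{conc:.0e} M")
print(round(span_in_orders_of_magnitude(series)), "orders of magnitude")
```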

In some embodiments, one or more of the series of standardized mixtures can allow for detection with one or more of the sensitivities, one or more of the accuracies, one or more of the detection limits, and/or with one or more of the coefficients of variation taught herein, over various ranges of orders of magnitude, e.g., any of the orders of magnitude discussed herein.

III. Compositions for Assessing Nucleic Acid

Another aspect of the instant invention relates to compositions for assessing a nucleic acid in a sample, for example, compositions comprising a standardized mixture or a series of serially-diluted standardized mixtures, e.g., as described above. Other aspects of the instant invention relate to databases, e.g., databases comprising data obtained using some embodiments of the methods and/or compositions of the present invention.

A. Database of Numerical Values

Another aspect of the instant invention is directed to a database. For example, some embodiments provide a database of numerical values corresponding to amounts of a first nucleic acid in a number of samples.

In preferred embodiments, the numerical values are directly comparable between the number of samples. For example, in some embodiments, the numerical values are directly comparable between at least about 5 samples, at least about 10 samples, at least about 50 samples, at least about 100 samples, at least about 500 samples, at least about 1,000 samples, at least about 5,000 samples, at least about 10,000 samples, at least about 50,000 samples, at least about 100,000 samples, at least about 500,000 samples, at least about 1,000,000 samples, at least about 5,000,000 samples, or at least about 10,000,000 samples. In some embodiments, direct comparison involves comparing the numerical values to one another without using a bioinformatics resource. In some embodiments, a bioinformatics resource, e.g., a simple bioinformatics resource, can be used.

FIG. 24 illustrates development and use of a database of numerical values of some embodiments described herein. At step 2401, measured amounts are obtained by any of the methods of various embodiments of the instant invention described herein to provide numerical values. For example, as step 2401 illustrates, a nucleic acid can be assessed relative to a known number of competitive template molecules for the nucleic acid that have been combined into a standardized mixture. Such embodiments can facilitate the reporting of nucleic acid measurement as a numerical value. For example, the numerical value can be obtained by calculating a “ratio of ratios” as described above. In some specific embodiments, each value in the database has been made relative to an internal standard within a standardized mixture of internal standards.

In preferred embodiments, numerical values correspond to numbers of molecules of a given nucleic acid in a sample. In some embodiments, numerical values can be provided in units of (molecules of a first nucleic acid)/(molecules of a second nucleic acid), e.g., where the second nucleic acid serves as a reference nucleic acid. In a specific embodiment, measurements are provided in units of (cDNA molecules of a first nucleic acid)/(10⁶ cDNA molecules of a second nucleic acid). Numerical values in some embodiments, for example, may correspond to less than about 1,000 molecules of a nucleic acid in a sample, e.g., to about 800, to about 600, or to about 400 molecules. In some embodiments, numerical values may correspond to less than about 100 molecules (e.g., to about 60 molecules), less than about 10 molecules (e.g., to about 6 molecules), or less than about 1 molecule of a nucleic acid in a sample. In some embodiments, numerical values may correspond to less than about 10,000,000, less than about 5,000,000, less than about 1,000,000, less than about 500,000, less than about 100,000, less than about 50,000, less than about 10,000, less than about 8,000, less than about 6,000, less than about 5,000, or less than about 4,000 molecules of a nucleic acid in a sample.
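As a minimal sketch of how such a numerical value might be computed, the example below assumes that the measured NT/CT ratio is multiplied by the known number of competitive template molecules to estimate the number of native template molecules, and that the result is then normalized to 10⁶ molecules of the reference nucleic acid. The function names and input values are hypothetical.

```python
def molecules_from_ratio(nt_ct_ratio, ct_molecules):
    """Estimate NT molecules from the measured NT/CT ratio and known CT copies."""
    return nt_ct_ratio * ct_molecules

def per_million_reference(target_molecules, reference_molecules):
    """Express a target as molecules per 10^6 reference molecules."""
    if reference_molecules <= 0:
        raise ValueError("reference molecule count must be positive")
    return target_molecules / reference_molecules * 1e6

# Hypothetical measurements: NT/CT ratios and CT copies per reaction.
target = molecules_from_ratio(0.5, 600)          # first nucleic acid
reference = molecules_from_ratio(1.0, 600000)    # second (reference) nucleic acid
print(per_million_reference(target, reference))
```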

The database of the instant invention can comprise numerical values varying over a range. For example, in some embodiments, numerical values can vary over a range of less than about one order of magnitude, more than about one order of magnitude, or more than about 2 orders of magnitude. In some embodiments, numerical values of measured amounts of different nucleic acids, e.g., mRNA levels expressed from two or more different genes, can vary over a range of about 3 or more orders of magnitude, about 4 or more orders of magnitude, about 5 or more orders of magnitude, about 6 or more orders of magnitude, or about 7 or more orders of magnitude, e.g., spanning the about 7-log range of gene expression of about 10⁻³, about 10⁻², about 0.1, about 1, about 10, about 10², about 10³, and about 10⁴ copies/cell. In some embodiments, numerical values of measured amounts of different nucleic acids can vary over a range of about 8 or more, about 9 or more, or about 10 or more orders of magnitude, e.g., spanning an about 10-log range of gene expression of about 10⁻³, about 10⁻², about 0.1, about 1, about 10, about 10², about 10³, about 10⁴, about 10⁵, or about 10⁶ copies/cell. Such ranges of gene expression may be important in detecting agents of biological warfare, for example.

In some embodiments, numerical values of the database correspond to less than about a two-fold difference in a nucleic acid between 2 of the samples. In some embodiments, the numerical values correspond to less than about a one-fold difference, less than about an 80% difference, less than about a 50% difference, less than about a 30% difference, less than about a 20% difference, less than about a 10% difference, less than about a 5% difference, or less than about a 1% difference.

Without being limited to a given hypothesis and/or theory, since the data in some embodiments is standardized against a common mixture of internal standard competitive templates, direct comparisons are possible. For example, as discussed above, in some embodiments, the numerical values are directly comparable between a number of samples, e.g., samples obtained from different subjects and/or from different species. In some embodiments the numerical values are directly comparable between a number of samples measured in different laboratories and/or at different times. In preferred embodiments, such comparisons are possible without the use of a calibrator sample (e.g., a non-renewable calibrator sample).

Two values can be described as being “directly comparable” where, e.g., the numerical values of each describe the amounts relative to a common standard. As a readily understandable analogy, 10° C. is directly comparable to 50° C. as both values are provided relative to the boiling point of water (100° C.). Using some embodiments provided herein, the number of cDNA molecules representing a gene in a given sample is measured relative to its corresponding competitive template in a standardized mixture, rather than by comparing it to another sample. Use of a common standardized mixture can provide the common standard and can facilitate direct comparisons.

In contrast, using techniques such as real-time RT-PCR and/or microarray analysis (other than in combination with some embodiments of the instant invention), nucleic acids being measured scale differently. For example, differences in hybridization melting temperatures between cDNA with bound polynucleotides (microarrays) or fluorescent probes (real-time RT-PCR) cause measurements to scale differently. Consequently, relative amounts of different nucleic acids in a specimen and/or between specimens may not be directly comparable, e.g., it may not be possible to compare differences in expression among many genes in a sample. Further, real-time RT-PCR and/or microarray analysis measurements may not provide direct information as to the number of molecules of a nucleic acid present in a sample.

Assessed amounts may also be corrected for one or more sources of variation, e.g., in accordance with various embodiments of the teachings provided herein. In some embodiments, the values in the database show a coefficient of variation of less than about 50%, less than about 30%, less than about 25%, less than about 20%, less than about 15%, less than about 10%, less than about 5%, or less than about 1% between 2 or more samples. In some preferred embodiments, numerical values do not comprise a statistically significant number of false positives. In some preferred embodiments, numerical values do not comprise a statistically significant number of false negatives. In more preferred embodiments, numerical values do not comprise false negatives.
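For reference, a coefficient of variation can be computed as the standard deviation divided by the mean, expressed as a percentage. The sketch below assumes the sample (n − 1) standard deviation; the text does not specify which estimator is intended, and the replicate values shown are invented for illustration.

```python
import statistics

def coefficient_of_variation(values):
    """Percent CV = 100 * sample standard deviation / mean.

    Assumes the sample (n - 1) standard deviation; requires at least
    two values and a non-zero mean.
    """
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * statistics.stdev(values) / mean

# Hypothetical replicate measurements (molecules per 10^6 reference molecules).
replicates = [90.0, 100.0, 110.0]
print(f"CV = {coefficient_of_variation(replicates):.1f}%")
```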

In some embodiments, the database further comprises numerical values corresponding to amounts of a number of other nucleic acid(s) in the samples, where said amounts are directly comparable. The number of other nucleic acids for which data is included in the database can be at least about 5, at least about 10, at least about 50, at least about 100, at least about 500, at least about 1,000, at least about 5,000, at least about 10,000, at least about 50,000, at least about 100,000, at least about 500,000, at least about 1,000,000, at least about 5,000,000 or at least about 10,000,000 other nucleic acids.

In some embodiments, the database of the instant invention can serve as a common databank, where measured amounts of nucleic acids (e.g., gene expression measurements) are reported as numerical values that allow for direct inter-experiment comparison. Step 2402 illustrates such a database. In preferred embodiments, the database establishes a continuously expanding virtual multiplex experiment (i.e., data from an ever-expanding number of nucleic acids, samples and/or specimens can be entered into a given database and compared directly to other data within the database). This can lead to synergistic increases in knowledge, e.g., knowledge regarding the relationship between gene expression patterns and phenotype.

More preferred embodiments of the instant invention can be used to provide a common language for gene expression. Gene expression may be measured at the mRNA, protein, or functional level, preferably at the mRNA level. For example, mRNA expression is regulated primarily by the number of transcripts available for translation. Because mRNA expression is related primarily to copy number, one is able to develop an internal standard for each gene and/or to establish a common unit for gene expression measurement. For example, in a multi-institutional study, data generated by methods discussed herein were sufficiently sensitive and reproducible to support development of a meaningful gene expression database, serving as a common language for gene expression.

Some embodiments provide a common language for gene expression across species. For example, primers can be identified that PCR amplify nucleic acids corresponding to both human and mouse genes, e.g., for at least about 20%, for at least about 30%, for at least about 50%, for at least about 80%, or for at least about 90% of genes common to humans and mice. Primers can also be developed to obtain wider cross-species application, e.g., for amplifying nucleic acids corresponding to two or more different species. For example, in some embodiments, primers can be identified that amplify nucleic acids corresponding to two or more of human, rat, pig, horse, sheep, monkey, plant, fruit fly, fish, yeast, bacterial and/or viral genes.

In some embodiments, the database is web-based. In some embodiments, the database finds use in experimental research, clinical diagnoses and/or drug development. Step 2403 illustrates this use. For example, in some embodiments, the database can be used to advance studies on pathways of transcriptional control, and/or serve as a basis for mechanistic investigation. For example, bivariate analysis of individual gene expression numerical values for transcription factor genes and genes controlled by these transcription factors can improve understanding of gene expression regulation. In some embodiments, this can increase insight into control of gene expression, e.g., in normal and malignant cells.

In some embodiments, the numerical values of a database described herein can be used in one, two, or more stages of drug development. Stages of drug development may include, e.g., drug target screening, lead identification, pre-clinical evaluation (e.g., bioassay and/or animal study), clinical trial and patient treatment. Such applications are described in more detail below.

B. Database of Numerical Indices

Some embodiments of the instant invention provide a database comprising numerical indices. The numerical indices can be obtained by mathematical computation of 2 or more numerical values, where the numerical values correspond to amounts of nucleic acids in a number of samples. In some embodiments, the database of the instant invention includes one or more numerical indices provided in FIGS. 1, 2 and/or 4.

In preferred embodiments, the numerical indices are directly comparable between the samples. For example, in some embodiments, the numerical indices are directly comparable between at least about 5 samples, at least about 10 samples, at least about 50 samples, at least about 100 samples, at least about 500 samples, at least about 1,000 samples, at least about 5,000 samples, at least about 10,000 samples, at least about 50,000 samples, at least about 100,000 samples, at least about 500,000 samples, at least about 1,000,000 samples, at least about 5,000,000 samples, or at least about 10,000,000 samples. In some embodiments, direct comparison involves comparing the numerical indices to one another without a bioinformatics resource. In some embodiments, a bioinformatics resource, e.g., a simple bioinformatics resource, can be used. In some specific embodiments, each measurement in the database has been made relative to an internal standard within a standardized mixture of internal standards.

As discussed above, nucleic acid measurements can be reported as numerical values. The numerical values can be combined by mathematical computation to provide a numerical index, e.g., allowing mathematical interaction among the numerical values. For example, in some embodiments, a numerical index is calculated by dividing a numerator by a denominator, the numerator corresponding to the amount of one of 2 nucleic acids and the denominator corresponding to the amount of the other of the 2 nucleic acids. In some embodiments, a numerical index is calculated by a series of one or more mathematical functions. For example, a numerical index may be calculated by a formula (gene 1 + gene 2)/(gene 3 − gene 4). A numerical index can be described as balanced, e.g., where it is computed by a formula having the same number of numerical values in the numerator as in the denominator. Methods for obtaining numerical indices that indicate a biological state, e.g., that can act as biomarkers by correlating with a phenotype of interest, are detailed below.
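The two kinds of index mentioned above can be sketched in a few lines. The function names and the input values are hypothetical; the second function simply evaluates the example formula (gene 1 + gene 2)/(gene 3 − gene 4), and the balance check counts numerical values on each side as described in the text.

```python
def two_gene_index(numerator_value, denominator_value):
    """Index formed by dividing one numerical value by another."""
    if denominator_value == 0:
        raise ValueError("denominator value must be non-zero")
    return numerator_value / denominator_value

def four_gene_index(gene1, gene2, gene3, gene4):
    """Example formula from the text: (gene 1 + gene 2) / (gene 3 - gene 4)."""
    denominator = gene3 - gene4
    if denominator == 0:
        raise ValueError("gene 3 and gene 4 values must differ")
    return (gene1 + gene2) / denominator

def is_balanced(numerator_count, denominator_count):
    """Balanced: same number of numerical values above and below the line."""
    return numerator_count == denominator_count

# Hypothetical values in molecules per 10^6 reference molecules.
print(two_gene_index(400.0, 100.0))               # 4.0
print(four_gene_index(300.0, 100.0, 250.0, 50.0)) # 2.0
print(is_balanced(2, 2))                          # True
```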

In some embodiments, the numerical indices are directly comparable between a number of samples, e.g., samples obtained from different subjects and/or from different species. In some embodiments, the numerical indices are directly comparable between a number of samples measured and/or enumerated in different laboratories and/or at different times.

In some embodiments, the database of the instant invention can serve as a common databank, where measured amounts of nucleic acids (e.g., gene expression measurements) are mathematically combined to provide numerical indices that allow for direct inter-experiment comparison. In preferred embodiments, the database establishes a continuously expanding multiplex experiment (i.e., data from an ever-expanding number of nucleic acids, samples and/or specimens can be used to calculate numerical indices that are entered into a given database and compared directly to other data within the database).

As discussed above, in some embodiments, any measured nucleic acid or combination of nucleic acids, including all measured nucleic acids, can be used as the reference nucleic acid, and data calculated using a first reference nucleic acid can be re-calculated relative to another reference nucleic acid. In the case of numerical indices, the difference in value obtained after converting from one reference nucleic acid to another can depend on how many numerical values are in the numerator and how many are in the denominator. For example, in some embodiments, each numerical value in a numerical index may be converted to the new reference in calculating the index. In some embodiments, for example, where there are equal numbers of numerical values in the numerator and denominator, conversion to a new reference may have no effect on the relative numerical index between samples and/or specimens.

In the case of balanced numerical indices where numerical values correspond to gene expression measurements, the effect of a reference nucleic acid that varies in expression from one sample and/or specimen to another can be neutralized. This can also occur in doing bivariate analysis. In other embodiments, for example, where there are non-equal numbers of numerical values in the numerator and denominator, the relative numerical index between samples and/or specimens may change in accordance with a difference in relative numerical values for the reference nucleic acids between the samples and/or specimens.
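The effect of changing the reference nucleic acid can be illustrated with a short Python sketch. It assumes, for illustration only, that each numerical value is a ratio to a first reference, that conversion to a new reference divides every value by the numerical value of the new reference, and that the index is a quotient of products; the gene names and values are hypothetical.

```python
from math import isclose

# Hypothetical values for one sample, each expressed relative to a first reference.
# "ref B" is the value of a second candidate reference relative to the first.
sample = {"gene A": 8.0, "gene B": 2.0, "gene C": 4.0, "ref B": 0.5}


def convert_reference(values, new_reference):
    """Re-express every numerical value relative to a new reference nucleic acid."""
    reference_value = values[new_reference]
    return {gene: value / reference_value for gene, value in values.items()}


def product_index(values, numerator_genes, denominator_genes):
    """Quotient of the product of numerator values and the product of denominator values."""
    numerator = 1.0
    for gene in numerator_genes:
        numerator *= values[gene]
    denominator = 1.0
    for gene in denominator_genes:
        denominator *= values[gene]
    return numerator / denominator


converted = convert_reference(sample, "ref B")

# Balanced index (one value over one value): the reference term cancels.
balanced_before = product_index(sample, ["gene A"], ["gene B"])
balanced_after = product_index(converted, ["gene A"], ["gene B"])

# Unbalanced index (two values over one value): one reference term remains.
unbalanced_before = product_index(sample, ["gene A", "gene C"], ["gene B"])
unbalanced_after = product_index(converted, ["gene A", "gene C"], ["gene B"])

print(balanced_before, balanced_after)
print(unbalanced_before, unbalanced_after)
```

In this sketch the balanced index is unchanged by the conversion, whereas the unbalanced index is scaled by the reciprocal of the new reference's value, so that sample-to-sample variation in the reference carries through to the index.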

In some embodiments, the database is web-based. In some embodiments, the database finds use in experimental research, clinical diagnoses and/or drug development. For example, in some embodiments, the database can be used to advance studies on pathways of transcriptional control, and/or serve as a basis for mechanistic investigation. For example, in some embodiments, at least one numerical index indicates a biological state. Numerical indices may correlate better with a given biological state, e.g., a given phenotype, than a numerical value corresponding to an individual nucleic acid (e.g., to an individual gene). For example, in some embodiments, the numerical indices of a database described herein can be used in one, two, or more stages of drug development. Such applications are described in more detail below.

IV. Applications

Another aspect of the instant invention relates to methods of using numerical values and/or indices in research, diagnostic and/or other applications.

A. Identification of Biomarkers

In some embodiments, methods for obtaining numerical indices are provided. In preferred embodiments, the numerical index obtained indicates a biological state. A “biological state” as used herein can refer to a phenotypic state, e.g., a clinically relevant phenotype or other metabolic condition of interest. Biological states can include, e.g., a disease phenotype, a predisposition to a disease state or a non-disease state; a therapeutic drug response or predisposition to such a response, an adverse drug response (e.g., drug toxicity) or a predisposition to such a response, a resistance to a drug, or a predisposition to showing such a resistance, etc. In preferred embodiments, the numerical index obtained can act as a biomarker, e.g., by correlating with a phenotype of interest. In some embodiments, the drug may be an anti-tumor drug. In preferred embodiments, use of embodiments of the instant invention described herein can provide personalized medicine.

In some embodiments, a method for obtaining a numerical index that indicates a biological state comprises providing 2 samples corresponding to each of a first biological state and a second biological state; measuring and/or enumerating an amount of each of 2 nucleic acids in each of the 2 samples; providing the amounts as numerical values that are directly comparable between a number of samples; mathematically computing the numerical values corresponding to each of the first and second biological states; and determining a mathematical computation that discriminates the two biological states.
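One simple way of determining a discriminating computation can be sketched in Python as follows. This is an illustrative assumption rather than a method required by any embodiment: every ordered pair of nucleic acids is tried as a ratio, and a ratio is retained only when every sample of the first state scores above every sample of the second state. The sample values and the function name are hypothetical.

```python
from itertools import permutations

# Hypothetical numerical values: 2 samples per biological state, 2 nucleic acids each.
first_state = [{"g1": 10.0, "g2": 2.0}, {"g1": 12.0, "g2": 3.0}]
second_state = [{"g1": 4.0, "g2": 4.0}, {"g1": 3.0, "g2": 6.0}]


def find_discriminating_ratio(first_state, second_state):
    """Return (numerator, denominator, gap) for the ratio that best separates the states.

    The gap is the lowest first-state ratio minus the highest second-state ratio;
    a positive gap means the two states do not overlap. Returns None when no
    ratio separates the states.
    """
    genes = sorted(first_state[0])
    best = None
    for numerator, denominator in permutations(genes, 2):
        first = [s[numerator] / s[denominator] for s in first_state]
        second = [s[numerator] / s[denominator] for s in second_state]
        gap = min(first) - max(second)
        if gap > 0 and (best is None or gap > best[2]):
            best = (numerator, denominator, gap)
    return best


best = find_discriminating_ratio(first_state, second_state)
print(best)
```

With larger numbers of samples and nucleic acids, a statistical or machine learning approach would ordinarily replace this exhaustive search.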

First and second biological states as used herein correspond to two biological states to be compared, such as two phenotypic states to be distinguished. Examples include, e.g., non-disease (normal) tissue vs. disease tissue; a culture showing a therapeutic drug response vs. a culture showing less of the therapeutic drug response; a subject showing an adverse drug response vs. a subject showing a less adverse response; a treated group of subjects vs. a non-treated group of subjects, etc.

A numerical index that discriminates a particular biological state, e.g., a disease or metabolic condition, can be used as a biomarker for the given condition and/or conditions related thereto. For example, in some embodiments, the biological state indicated can be at least one of an angiogenesis-related condition, an antioxidant-related condition, an apoptosis-related condition, a cardiovascular-related condition, a cell cycle-related condition, a cell structure-related condition, a cytokine-related condition, a defense response-related condition, a development-related condition, a diabetes-related condition, a differentiation-related condition, a DNA replication and/or repair-related condition, an endothelial cell-related condition, a hormone receptor-related condition, a folate receptor-related condition, an inflammation-related condition, an intermediary metabolism-related condition, a membrane transport-related condition, a neurotransmission-related condition, a cancer-related condition, an oxidative metabolism-related condition, a protein maturation-related condition, a signal transduction-related condition, a stress response-related condition, a tissue structure-related condition, a transcription factor-related condition, a transport-related condition, and a xenobiotic metabolism-related condition.

For example, in specific embodiments, numerical indices that indicate lung cancer (E. L. Crawford, K. A. Warner, S. A. Khuder et al., Biochem. Bioph. Res. Co. 293, 509-516 (2002); E. L. Crawford, S. A. Khuder, S. J. Durham et al., Cancer Res. 60, 1609-1618 (2000); J. P. DeMuth, C. M. Jackson, D. A. Weaver et al., Am. J. Respir. Cell Mol. Biol. 19, 18-24 (1998)), pulmonary sarcoidosis (M. G. Rots, R. Pieters, G. J. Peters et al., Blood 94, 3121-3128 (1999)), cystic fibrosis (J. T. Allen, R. A. Knight, C. A. Bloor and M. A. Spiteri, Am. J. Respir. Cell. Mol. Biol. 21, 693-700 (1999)) and chemo-resistance in childhood leukemias (S. Mollerup, D. Ryberg, A. Hewer et al., Cancer Res. 59, 3317-3320 (1999)) have been identified. In other specific embodiments, antioxidant and xenobiotic metabolism enzyme genes have been evaluated in human buccal epithelial cells; micro-vascular endothelial cell gene expression has been associated with scleroderma progression; membrane transport gene expression has been studied in rat congestive heart failure models; immune resistance has been studied in primary human tissues; transcription control of hormone receptor expression has been studied; and gene expression patterns have been associated with carboplatin and/or taxol resistance in ovarian carcinoma and with gemcitabine resistance in multiple human tumors. Other specific examples include, e.g., identification of numerical indices for predicting responsiveness of colon cancer to 5-FU and for indicating one or more different stages of bladder carcinoma. Embodiments of inventions described herein can accelerate discovery of associations between gene expression patterns and biological states of interest, leading to better methods for preventing, diagnosing and treating various conditions.

FIG. 25 illustrates use of numerical indices in identifying a biological state.

Measuring nucleic acid amounts may be performed by any methods known in the art and/or described herein. Preferably, the method used can measure and/or enumerate less than about 10,000 molecules, less than about 8,000, less than about 6,000, or less than about 4,000, preferably less than about 1,000, less than about 800, less than about 600, or less than about 400 molecules, of a given nucleic acid in a given sample. In some embodiments, the measurements correspond to gene expression measurements, e.g., levels of mRNA transcripts can be measured. In preferred embodiments, transcript levels, in particular, transcript levels of 2 or more genes, can be used to indicate a biological state. For example, microarray analysis has identified gene sets that are associated with disease states and/or drug responses (D. A. Wigle, I. Jurisica, N. Radulovich et al., Cancer Res. 62, 3005-3008 (2002); M. E. Garber, O. G. Troyanskaya, K. Schluens et al., Proc. Natl. Acad. Sci. USA 98, 13784-13789 (2001); A. Bhattacharjee, W. G. Richards, J. Staunton et al., Proc. Natl. Acad. Sci. USA 98, 13790-13795 (2001); I. Hedenfalk, D. Duggan, Y. Chen et al., New Engl. J. Med. 344, 539-548 (2001); T. Sorlie, C. M. Perou, R. Tibshirani et al., Proc. Natl. Acad. Sci. USA 98, 10869-10874 (2001); C. M. Perou, S. S. Jeffrey, M. van de Rijn et al., Proc. Natl. Acad. Sci. USA 96, 9212-9217 (1999)). Providing the measured and/or enumerated amounts as numerical values is preferably accomplished by methods described herein, where the numerical values are directly comparable for a number of samples used.

In some embodiments, one or more of the nucleic acids to be measured are associated with one of the biological states to a greater degree than the other(s). For example, in some preferred embodiments, one or more of the nucleic acids to be evaluated is associated with a first biological state and not with a second biological state. A nucleic acid may be said to be “associated with” a particular biological state where the nucleic acid is either positively or negatively associated with the biological state. For example, a nucleic acid may be said to be “positively associated” with a first biological state where the nucleic acid occurs in higher amounts in a first biological state compared to a second biological state. As an illustration, genes highly expressed in cancer cells compared to non-cancer cells can be said to be positively associated with cancer. On the other hand, a nucleic acid present in lower amounts in a first biological state compared to a second biological state can be said to be negatively associated with the first biological state.

The nucleic acid to be measured and/or enumerated may correspond to a gene associated with a particular phenotype. The sequence of the nucleic acid may correspond to the transcribed, expressed, and/or regulatory regions of the gene (e.g., a regulatory region of a transcription factor, e.g., a transcription factor for co-regulation).

In some embodiments, expressed amounts of more than 2 genes are measured and used to provide a numerical index indicative of a biological state. For example, in some cases, expression patterns of about 50 to about 100 genes are used to characterize a given phenotypic state, e.g., a clinically relevant phenotype. See, e.g., Hedenfalk, I. et al. NEJM 344: 539, 2001. In some embodiments of the instant invention, expressed amounts of at least about 5 genes, at least about 10 genes, at least about 20 genes, at least about 50 genes, or at least about 70 genes may be measured and used to provide a numerical index indicative of a biological state. In some embodiments of the instant invention, expressed amounts of less than about 90 genes, less than about 100 genes, less than about 120 genes, less than about 150 genes, or less than about 200 genes may be measured and used to provide a numerical index indicative of a biological state. Specific examples of several of these embodiments include, e.g., identification of gene expression patterns associated with lung cancer (Crawford, E. L. et al., Normal bronchial epithelial cell expression of glutathione transferase P1, glutathione transferase M3, and glutathione peroxidase is low in subjects with bronchogenic carcinoma. Cancer Res., 60: 1609-1618, 2000; DeMuth, et al., The gene expression index c-myc×E2F-1/p21 is highly predictive of malignant phenotype in human bronchial epithelial cells. Am. J. Respir. Cell Mol. Biol., 19: 18-24, 1998); pulmonary sarcoidosis (Allen, J. T., et al., Enhanced insulin-like growth factor binding protein-related protein 2 (connective tissue growth factor) expression in patients with idiopathic pulmonary fibrosis and pulmonary sarcoidosis. Am. J. Respir. Cell Mol. Biol., 21: 693-700, 1999); cystic fibrosis (Allen, et al., supra); and chemoresistance in childhood leukemias (Rots, M. G., et al., Circumvention of methotrexate resistance in childhood leukemia subtypes by rationally designed antifolates. Blood, 94(9): 3121-3128, 1999; Rots, M. G., et al., mRNA expression levels of methotrexate resistance-related proteins in childhood leukemia as determined by a competitive template-based RT-PCR method. Leukemia, 14: 2166-2175 (2000)).

Mathematically computing numerical values can refer to using any equation, operation, formula and/or rule for interacting numerical values, e.g., a sum, difference, product, quotient, log power and/or other mathematical computation. As described above, in some embodiments, a numerical index is calculated by dividing a numerator by a denominator, where the numerator corresponds to an amount of one nucleic acid and the denominator corresponds to an amount of another nucleic acid. In preferred embodiments, the numerator corresponds to a gene positively associated with a given biological state and the denominator corresponds to a gene negatively associated with the biological state. In some embodiments, more than one gene positively associated with the biological state being evaluated and more than one gene negatively associated with the biological state being evaluated can be used. For example, in some embodiments, a numerical index can be derived comprising numerical values for the positively associated genes in the numerator and numerical values for an equivalent number of the negatively associated genes in the denominator. As mentioned above, in such balanced numerical indices, the reference nucleic acid numerical values cancel out. An example of a balanced numerical index is a numerical index for predicting anti-folate resistance among childhood leukemias. Rots, M. G., Willey, J. C., Jansen, G., et al. (2000) mRNA expression levels of methotrexate resistance-related proteins in childhood leukemia as determined by a standardized competitive template-based RT-PCR method. Leukemia 14, 2166-2175. In some embodiments, balanced numerical indices can neutralize effects of variation in the expression of the gene(s) providing the reference nucleic acid(s). In some embodiments, a numerical index is calculated by a series of one or more mathematical functions.

Determining which mathematical computation to use to provide a numerical index indicative of a biological state may be achieved by any methods known in the arts, e.g., in the mathematical, statistical, and/or computational arts. In some embodiments, determining the mathematical computation involves the use of software. For example, in some embodiments, machine learning software can be used.

In some embodiments, more than one sample corresponding to each biological state can be provided. For example, at least about 5 samples, at least about 10 samples, at least about 50 samples, at least about 100 samples, at least about 500 samples, at least about 1,000 samples, at least about 5,000 samples, at least about 10,000 samples, at least about 50,000 samples, at least about 100,000 samples, at least about 500,000 samples, at least about 1,000,000 samples, at least about 5,000,000 samples, or at least about 10,000,000 samples may be provided.

In some embodiments, more than 2 biological states can be compared, e.g., distinguished. For example, in some embodiments, samples may be provided from a range of biological states, e.g., corresponding to different stages of disease progression, e.g., different stages of cancer. Cells in different stages of cancer, for example, include a non-cancerous cell vs. a non-metastasizing cancerous cell vs. a metastasizing cell from a given patient at various times over the disease course. Cancer cells of various types of cancer may be used, including, for example, a bladder cancer, a bone cancer, a brain tumor, a breast cancer, a colon cancer, an endocrine system cancer, a gastrointestinal cancer, a gynecological cancer, a head and neck cancer, a leukemia, a lung cancer, a lymphoma, a metastasis, a myeloma, neoplastic tissue, a pediatric cancer, a penile cancer, a prostate cancer, a sarcoma, a skin cancer, a testicular cancer, a thyroid cancer, and a urinary tract cancer. In preferred embodiments, biomarkers can be developed to predict which chemotherapeutic agent can work best for a given type of cancer, e.g., in a particular patient.

A non-cancerous cell may include a cell of hematoma and/or scar tissue, as well as morphologically normal parenchyma from non-cancer patients, e.g., non-cancer patients related or not related to a cancer patient. Non-cancerous cells may also include morphologically normal parenchyma from cancer patients, e.g., from a site close to the site of the cancer in the same tissue and/or same organ; from a site further away from the site of the cancer, e.g., in a different tissue and/or organ in the same organ-system; or from a site still further away, e.g., in a different organ and/or a different organ-system.

Numerical indices obtained can be provided as a database. Numerical indices and/or databases thereof can find use in diagnoses, e.g., in the development and application of clinical tests, as described below.

B. Micro-Array Screening

Another application of some embodiments of the instant invention relates to use with screening techniques, e.g., screening techniques using solid-phase hybridization such as microarray analyses. For example, in specific embodiments, relevant gene expression patterns can be identified through microarray gene expression screening, followed by assays suitable for analysis of a subset of genes.

FIG. 26 illustrates the overall process relating to using micro-array screens with embodiments of the instant invention. FIG. 26(a) schematically illustrates discovery of genes of interest using micro-arrays. In some embodiments, a population of genes may be screened to determine a subset of genes of interest, e.g., genes corresponding to nucleic acids associated with a first biological state but not with a second biological state. In some embodiments, for example, a subset comprising about 30, about 50, about 80, about 100, about 120, about 150, about 200, about 250, or about 300 genes may be found to be associated with a clinically relevant phenotype, e.g., disease vs. non-disease states, or any other biological states to be distinguished, as discussed above. The microarray analysis may use any microarrays and microarray techniques known in the art and/or described herein. One or more of the nucleic acids so identified may then be evaluated in accordance with some embodiments described herein.

FIG. 26(b) schematically illustrates evaluations of genes of interest, according to some embodiments described here. Briefly, as FIG. 26(b) illustrates, mRNA corresponding to one or more genes (e.g., 3 genes) can be extracted and reverse transcribed, e.g., as discussed in detail above. Again as discussed above, a cDNA sample may be quantitatively balanced and combined with an appropriate standardized mixture, e.g., comprising competitive templates for each of the genes to be evaluated. The native template for each of the 3 genes may be co-amplified with its corresponding competitive template in a given vessel. PCR amplification can be followed by electrophoresis to provide an electropherogram. Areas under the curve can be used to obtain a “ratio of ratios” as detailed above. Expression measurements for each of the 3 genes are provided as a numerical value.

Any other embodiments and/or variations of these methods, e.g., as disclosed herein, can be used, e.g., to allow for detection with one or more of the sensitivities, one or more of the accuracies, one or more of the detection limits, and/or one or more of the coefficients of variation taught herein. In preferred embodiments, methods employed can improve the threshold of detection, sensitivity and/or coefficient of variation compared to micro-arrays. For example, analysis using some embodiments of the instant invention can avoid, reduce and/or control differences in melting temperatures between cDNA for each gene and the oligonucleotide or cDNA spotted on the array; differences in amount of sample loaded; time of hybridization; stringency of wash; and/or parameters used to calibrate fluorescence intensity. Details of experiments comparing methods of the instant invention with microarray analysis are provided in Example V below.

In some embodiments, nucleic acids corresponding to genes of interest are evaluated in samples corresponding to one or more biological states. For example, a sample corresponding to a first biological state and a sample corresponding to a second biological state may be used. In some embodiments, amounts of nucleic acids corresponding to each of said two biological states may be evaluated and/or enumerated, e.g., to provide data representative of the two biological states.

FIG. 26(c) schematically illustrates mathematical computation of numerical values obtained for the genes of interest. Numerical values obtained for 2 or more nucleic acids can be used to determine one or more numerical indices. For example, again as detailed above, numerical values corresponding to each of a first and a second biological state can be mathematically combined. A mathematical computation can be determined that indicates the biological state of interest, e.g., by discriminating the first and second biological states. FIG. 26(c) illustrates using software to perform the mathematical computations, again as provided in detail above.

FIG. 26(d) schematically illustrates use of such numerical indices in a clinical setting, e.g., as a biomarker for diagnoses and/or prognoses, as discussed in more detail below. When analyzing clinical samples that are size limited, it is likely to be more cost effective to measure only those genes that contribute information on expression profiles that define the biological state of interest. Accordingly, rather than measuring expression of a large population of genes, e.g., about 40,000 to about 80,000 genes, a smaller subset can be evaluated in clinical samples. Using some embodiments for evaluating nucleic acids herein provided, multi-gene measurements can be made on the smaller subset and the data used in biomarker tests, e.g., as numerical indices indicative of the biological state.

C. Diagnostic Applications

In some embodiments of the instant invention, a method of identifying a biological state is provided. In some embodiments, the method comprises measuring and/or enumerating an amount of each of 2 nucleic acids in a sample; providing the amounts as numerical values; and using the numerical values to provide a numerical index, whereby the numerical index indicates the biological state. In some embodiments, the numerical index comprises a numerical index provided in FIGS. 1, 2 and/or 4.

A numerical index that indicates a biological state can be determined as described above in accordance with various embodiments of the instant invention. The sample may be obtained from a specimen, e.g., a specimen collected from a subject to be treated. The subject may be in a clinical setting, including, e.g., a hospital, office of a health care provider, clinic, and/or other health care and/or research facility. Amounts of nucleic acid(s) of interest in the sample can then be measured and/or enumerated.

Assessing nucleic acid amounts may be performed by any methods described herein. Preferably, the method used can measure and/or enumerate less than about 10,000 molecules, less than about 8,000, less than about 6,000, or less than about 4,000, preferably less than about 1,000, less than about 800, less than about 600, or less than about 400 molecules, of a given nucleic acid in a given sample. In cases where several genes are to be measured in a sample and/or specimen, preferred embodiments can be practiced using small amounts of starting cellular material, e.g., using the amounts of material obtained from a diagnostic biopsy sample, e.g., by the methods described in more detail above and/or as known in the art. In more preferred embodiments, more than one gene can be evaluated at the same time, and in highly preferred embodiments, where a given number of genes are to be evaluated, expression data for that given number of genes can be obtained simultaneously. For example, in some embodiments, data obtained from primary lung cancer tissue can be assayed. By comparing the expression pattern of certain genes to those in a database, a chemotherapeutic agent to which a tumor with that gene expression pattern would most likely respond can be determined.

In some embodiments, methods of the invention can be used to evaluate simultaneously both an exogenous reporter gene and an endogenous housekeeping gene, such as GAPDH RNA in a transfected cell, either in vitro or in vivo. In some embodiments, for example, relative amounts of exogenous cystic fibrosis transmembrane conductance regulator (CFTR) gene per cell can be measured. Although numerous different mutations in the CFTR gene have been reported to be associated with disease, the most common disease-associated mutation is a 3-base deletion at position 508. It is possible to prepare primers that result in amplification of one or the other of the abnormal 508-deleted gene and the normal CFTR gene using described methods, e.g., Cha, R. S., Zarbl, H., Keohavong, P., Thilly, W. G., Mismatch amplification mutation assay (MAMA): application to the c-Ha ras gene, PCR Methods and Applications, 2:14-20 (1992). These can be used with certain embodiments of the present invention to measure amounts of exogenous normal CFTR nucleic acid and/or amounts of endogenous mutant CFTR gene.

Similarly, in some embodiments, methods of the invention can be used to quantify exogenous normal dystrophin gene in the presence of mutated endogenous gene. In the case of dystrophin, the disease results from relatively large deletions. Using primers that span the deleted region, one can selectively amplify and quantitate expression from a transfected normal gene and/or a constitutive abnormal gene for dystrophin. As will be appreciated by those in the art, other genes associated with other diseases and/or conditions can also be evaluated in a similar manner.

In some embodiments, methods described herein can be used to determine normal expression levels, e.g., providing numerical values corresponding to normal gene transcript expression levels. Such embodiments may be used to indicate a normal biological state, at least with respect to expression of the evaluated gene.

Normal expression levels can refer to the expression level of a transcript under conditions not normally associated with a disease, trauma, and/or other cellular insult. In some embodiments, normal expression levels may be provided as a number, or preferably as a range of numerical values corresponding to a range of normal expression of a particular gene, e.g., within ± a percentage for experimental error. A numerical value obtained for a given nucleic acid in a sample, e.g., a nucleic acid corresponding to a particular gene, can be compared to established normal numerical values, e.g., by comparison to data in a database provided herein. As numerical values can indicate numbers of molecules of the nucleic acid in the sample, this comparison can indicate whether or not the gene is being expressed within normal levels.
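A comparison of this kind can be sketched in Python as follows. The function name, the range limits and the error percentage are hypothetical; the sketch only assumes that the normal range is widened at each end by a stated percentage for experimental error.

```python
def classify_expression(value, normal_low, normal_high, error_percent=0.0):
    """Compare a numerical value with a normal range widened by an error percentage."""
    low = normal_low * (1 - error_percent / 100)
    high = normal_high * (1 + error_percent / 100)
    if value < low:
        return "below normal range"
    if value > high:
        return "above normal range"
    return "within normal range"


# Hypothetical normal range of 1,000 to 2,000 molecules, with 10% experimental error.
for measured in (850, 950, 2300):
    print(measured, classify_expression(measured, 1000, 2000, error_percent=10))
```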

Some embodiments, for example, provide a method for identifying a biological state comprising assessing an amount of a nucleic acid in a first sample, and providing said amount as a numerical value, wherein said numerical value is directly comparable between a number of other samples. In some embodiments, the numerical value is directly comparable to at least about 5, at least about 10, at least about 100, at least about 1,000, at least about 5,000, or at least about 10,000 other samples. In some embodiments, the numerical value is potentially directly comparable to an unlimited number of other samples. Samples may be evaluated at different times, e.g., on different days; in the same or different experiments in the same laboratory; and/or in different experiments in different laboratories.

In preferred embodiments, the biological state corresponds to a normal expression level of a gene. Where the biological state does not correspond to normal levels, for example falling outside of a desired range, a non-normal condition, e.g., a disease condition, may be indicated, as discussed above.

V. Business Methods

Another aspect of the present invention relates to business methods, including business methods for providing gene expression measurement services and for improving research and development.

A. Nucleic Acid Evaluation Services

FIG. 27 illustrates the overall process of some embodiments of a business method for evaluating nucleic acids. In preferred embodiments, the business provides gene expression measurements. The amounts and/or concentrations of other nucleic acids can also be evaluated in some embodiments, e.g., as described by the methods herein.

Preferred embodiments measure an amount of a nucleic acid to provide standardized, reproducible gene expression measurements as a service. “Measuring an amount of a nucleic acid” can refer to running a given assay for evaluating the nucleic acid. Measuring nucleic acid amounts may be performed by any methods known in the art and/or described herein. Preferably, the method used can measure and/or enumerate less than about 10,000 molecules, less than about 8,000, less than about 6,000, or less than about 4,000, preferably less than about 1,000, less than about 800, less than about 600, or less than about 400 molecules, of a given nucleic acid in a given sample.

Step 2701 illustrates collecting a specimen comprising nucleic acid, e.g., from a customer. The nucleic acid material may be mRNA, cDNA, genomic DNA and/or any other nucleic acid material, as provided for herein. Customers may include pharmaceutical companies, universities and/or other research organizations, government agencies, clinicians, medical practitioners and/or other health care providers, as well as any entity desiring information regarding nucleic acid concentration in a sample.

The specimen collected may be a specimen from any biological entity comprising a nucleic acid. For example, specimens may be collected from different subjects and/or different species. In some embodiments, the specimen comprises a human specimen. In some embodiments, the specimen is collected with or without identifying information. For example, in some embodiments, customers may send human specimens without annotating information to preserve anonymity.

Step 2702 illustrates collecting information selecting which nucleic acids in the specimen are to be measured. For example, in some embodiments, customers select a set of genes whose expression levels are to be evaluated, and send a request listing the selected genes along with the specimen for analysis. In some embodiments, nucleic acids and/or genes available for analysis are listed on a website. An example of such a list may be found at www.geneexpressinc.com/assays_list.asp. In some embodiments, the information may be collected via a website.

In some embodiments, the business method further comprises collecting information attesting to compliance with investigative protocol. For example, a request for gene expression measurement may include an attestation that any primary human samples and/or specimens were obtained under an approved and/or active institutional review board (IRB) protocol. In cases where no or negligible potentially identifying information is provided, there may be no need to obtain an IRB protocol for the specimen to be analyzed. Identifying information can include any information that would identify the subjects that provided the specimen and/or samples being assessed.

In some embodiments, upon receipt of a specimen to be tested, a number is assigned comprising basic information, such as, e.g., the date, the number received that day and/or other preliminary organizing information. In some embodiments, a label with some or all of such information can be attached to the material. In preferred embodiments, a duplicate label can be provided to the customer, e.g., for their records. In some embodiments, the basic information can be entered into a log, e.g., with the customer's name and/or account number, e.g., for billing purposes.
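The numbering and logging described above can be sketched as follows. This is only an illustration: the label format (date followed by a daily sequence number), the function names and the log fields are assumptions, not part of the described method.

```python
from datetime import date


def make_accession_number(received_on: date, daily_sequence: int) -> str:
    """Combine the receipt date and the number received that day.

    The 'YYYYMMDD-NNN' layout is a hypothetical format chosen for this sketch.
    """
    if daily_sequence < 1:
        raise ValueError("daily_sequence starts at 1")
    return f"{received_on:%Y%m%d}-{daily_sequence:03d}"


def make_log_entry(accession: str, customer_name: str, account_number: str) -> dict:
    """Build a log record that could later be used for billing (fields assumed)."""
    return {
        "accession": accession,
        "customer_name": customer_name,
        "account_number": account_number,
    }


accession = make_accession_number(date(2005, 1, 21), 3)
entry = make_log_entry(accession, "Example Customer", "ACCT-0001")
```

The same accession string can be printed twice, once for the label attached to the material and once for the duplicate label given to the customer.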

Step 2703 illustrates assessment of the collected specimen. For example, in some embodiments, RNA specimens may be assessed for quality. An Agilent 2100 RNA chip may be used for this purpose. In some embodiments, an approximate measurement of the amount of RNA provided and/or cDNA provided and/or obtained may be made. If there is insufficient quality and/or quantity, the customer can be notified, e.g., asked to prioritize genes to be evaluated and/or asked to send more RNA and/or cDNA material.

In preferred embodiments, measurements can be obtained rapidly. To obtain data rapidly, several nucleic acids may be assayed at the same time or during overlapping time periods, e.g., to accommodate numerous steps of measuring the amounts of nucleic acids in a given time period. In some embodiments, for example, an assay is performed at least about 10 times per day, at least about 50 times per day, at least about 100 times per day, at least about 500 times per day, at least about 1,000 times per day, at least about 2,000 times per day, at least about 4,000 times per day, at least about 5,000 times per day, at least about 10,000 times per day, at least about 50,000 times per day, at least about 100,000 times per day, at least about 500,000 times per day, at least about 1,000,000 times per day, at least about 5,000,000 times per day, at least about 10,000,000 times per day, at least about 50,000,000 times per day, or at least about 100,000,000 times per day.

In preferred embodiments, one or more steps of the business model are automated, e.g., to increase speed. For example, in some embodiments, one or more embodiments of the computer implemented methods described above may be used.

Step 2704 illustrates quantitative balancing of a cDNA sample, e.g., as described herein, and which may be automated. In some embodiments, the business method further comprises identifying which of the selected nucleic acids electrophorese together. For example, some embodiments use software to identify which nucleic acids, e.g., cDNAs corresponding to various genes, can be electrophoretically separated if run together, e.g., to be separated simultaneously.

Software can be used to identify which genes may be electrophoresed together for the set of genes selected by a customer. As discussed in more detail above, factors considered include length in base pairs of the nucleic acid and its respective competitive template, as well as the relative lengths of various selected nucleic acids and the nucleic acid serving as a reference. For example, in some embodiments, primers and competitive templates can be designed to produce suitably sized amplified PCR products of one or more of the various selected nucleic acids and/or their respective competitive templates. In preferred embodiments, nucleic acids identified as electrophoresing together can be run together and/or enumerated at the same time or about the same time.
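One way such software might group genes is sketched below. It is a simple greedy grouping and not the method actually used: the 10% minimum size difference, the limit of four genes per channel, the gene names and the product lengths are all assumptions made for illustration.

```python
def resolvable(a_bp: int, b_bp: int, min_rel_diff: float = 0.10) -> bool:
    """True if two products differ in length by at least min_rel_diff (assumed rule)."""
    return abs(a_bp - b_bp) / min(a_bp, b_bp) >= min_rel_diff


def group_for_electrophoresis(products: dict, min_rel_diff: float = 0.10,
                              max_genes_per_channel: int = 4) -> list:
    """Greedily assign genes to channels so every product in a channel is resolvable.

    products maps a gene name to (native template bp, competitive template bp).
    """
    channels = []  # each channel is a list of (gene, sizes) pairs
    for gene, sizes in sorted(products.items(), key=lambda item: item[1]):
        for channel in channels:
            fits = len(channel) < max_genes_per_channel and all(
                resolvable(size, other, min_rel_diff)
                for size in sizes
                for _, other_sizes in channel
                for other in other_sizes
            )
            if fits:
                channel.append((gene, sizes))
                break
        else:
            channels.append([(gene, sizes)])
    return [[gene for gene, _ in channel] for channel in channels]


# Hypothetical product lengths in base pairs: (native template, competitive template)
example_products = {"geneA": (500, 400), "geneB": (300, 240), "geneC": (510, 410)}
channel_plan = group_for_electrophoresis(example_products)
```

In this hypothetical case geneA and geneB share a channel, while geneC is placed in a second channel because its 510 bp product is too close to the 500 bp product of geneA.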

Step 2705 illustrates the selection of one of a series of standardized mixtures for combining with a selected cDNA dilution, again as described herein. The Mix can be selected to provide competitive templates for each of the genes to be evaluated, e.g., genes selected by the customer and/or genes identified as electrophoresing together. In some embodiments, many nucleic acids are amplified in a given PCR reaction to speed measuring. In more preferred embodiments, all the genes selected to be measured by a given customer in a particular specimen and/or sample are measured simultaneously.

Step 2706 illustrates combining a Mix and a cDNA dilution along with transfer to vessels for PCR amplification. In preferred embodiments, a sufficient volume of PCR mixture for the anticipated number of nucleic acid measurements can be prepared. In some specific embodiments, for example, a PCR mixture can contain reagents for performing selective amplification of the nucleic acids to be evaluated and the corresponding competitive templates, including, e.g., buffer, one or more thermostable polymerases, NTPs and/or dNTPs, cDNA and competitive templates.

In some embodiments, reaction preparation and/or transfer is automated. For example, an automated means to prepare and load chips can be used. For example, automated means comprising one or more steps of pressurizing, loading markers, vortexing, and loading a chip into an Agilent 2100 can be used in some embodiments. For example, a robotic liquid handler can be programmed to access different reagent reservoirs, assemble PCR reaction mixtures, and distribute them into various vessels, e.g., 96- and/or 384-well microplates. For example, in preferred embodiments, the robotic liquid handler can transfer a 1 μL aliquot to wells of Agilent 2100 DNA 1000 chips, e.g., automatically dispensing a solution or combination of solutions into individual wells and/or varying the spacing between sample probes (Varispan). In some embodiments, a liquid handler can be programmed to distribute a given volume of primers for the nucleic acid(s) to be measured into reaction vessels. In preferred embodiments, one pair of primers can be present in a given vessel, allowing amplification of a given nucleic acid and its respective competitive template in that vessel. In preferred embodiments, the robotic liquid handler is able to communicate with one or more other devices used in the process, e.g., as detailed below.

Amplification can take place in an amplification device. The amplification device may comprise any of the systems described herein and/or known in the art for amplifying nucleic acids. Some embodiments of the instant business method use one or more thermocyclers. In some preferred embodiments, thermocyclers having motorized and/or heated lids are used, e.g., to allow oil-free thermocycling and/or automation. For example, some specific embodiments use two MJ 384-well block thermocyclers in the MJ PTC-225 DNA Engine Tetrad System, which can further be expanded to four 384-well microplate block thermocyclers.

In preferred embodiments, the thermocycler used is compatible with the robotic liquid handler used. For example, a Multiprobe II HT EX robotic liquid handler can communicate with one or more thermocyclers, e.g., to coordinate lid-opening and/or closing with microplate insertion and/or removal. In preferred embodiments, the robotic liquid handler can avoid cross-contamination of reaction vessels, can position filled vessels in block thermocyclers for amplification, and/or can remove aliquots from the vessels following amplification.

Step 2707 illustrates transfer of the contents of reaction vessels to a separation device, i.e., a device for separating amplified product of the nucleic acid being measured and its respective competitive template, e.g., in accordance with methods known in the art and/or detailed herein. Some embodiments use a microfluidic chip with a sipper that moves from well to well, aspirating and then electrophoretically separating amplified product at a rate of, e.g., at least about every 10 seconds, at least about every 20 seconds, at least about every 30 seconds, at least about every 40 seconds, or at least about every 50 seconds. Some embodiments allow analysis of a 384-well plate in approximately three hours. In some embodiments, a combined throughput of 4,608 measurements/24 hours can be achieved.

As described above, where amplified products are to be separated by electrophoresis, the size of the competitive templates and/or reference nucleic acid(s) can be selected to differ from that of the target nucleic acid. In some embodiments, primers are designed to amplify a nucleic acid and its respective competitive template to give amplified products of suitable sizes, e.g., sizes that facilitate obtaining data rapidly. For example, designing primers that amplify differently sized products for different target nucleic acids can support automation and high-throughput applications, including capillary gel and microchannel CE. Other embodiments may use microarrays, microbeads, MALDI-TOF MS and/or real-time RT-PCR as detailed above.

For example, in some embodiments competitive templates and/or amplified products are at least about 20, at least about 25, at least about 30, at least about 50, at least about 75, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, or at least about 400 base pairs. In some embodiments, amplified products are less than about 500, less than about 600, less than about 700, less than about 800, less than about 850, less than about 900, less than about 1,000, less than about 1,500, less than about 1,800, less than about 2,000, less than about 2,200, or less than about 2,500 base pairs. In some embodiments, amplified products corresponding to 1 gene, 2 genes, 3 genes, 4 genes, about 5 genes, about 10 genes, about 15 genes, about 20 genes or more can be separated and quantified in a given channel, e.g., using microchannel CE where different PCR NT and CT products have different sizes.

In a specific embodiment where amplified products are between about 200 and about 800 base pairs, more than about 100 genes can be amplified and/or separated in a given CE channel. It can be appreciated that a 96-channel CE device, for example, may be converted to an automated, high-throughput (>300,000 standardized gene expression assays/24 hours) device using some embodiments of the instant invention. Similarly, use of a Caliper AMS 90 SE30 device can provide more than about 1,000 nucleic acid measurements in about eight hours. Another specific embodiment uses amplified products between about 20 and about 2,000 base pairs.

The Agilent 2100 Lab-on-a-chip 1000, for example, can separate bands with approximately 10% difference in size. For example, in an about 150 to about 850 base pair span, about 15 differently sized PCR products (e.g., about 150, about 170, about 190, about 210, about 230, about 260, about 290, about 320, about 350, about 400, about 440, about 490, about 540, about 600, about 660, about 730, about 800 base pairs) can be separated in some embodiments. Using an Agilent 2100 Bioanalyzer, electrophoresis can take at least about 1 minute, at least about 1.5 minutes, or at least about 2 minutes. Running multiple channels on an Agilent 2100 chip can take at least about 5 minutes, or at least about 10 minutes, or at least about 15 minutes. In some embodiments, running multiple channels can take less than about 15 minutes, less than about 20 minutes, less than about 25 minutes, or less than about 30 minutes, e.g., using an Agilent 2100 Bioanalyzer. As a specific example, using 12 channels and running two chips/hour, an 8 hour day of continuous use would allow 12 channels/chip × 1 chip/30 minutes × two 30 minute segments/hour × 1 PCR product/channel × 8 hours = about 192 expression measurements/day. Some such embodiments provide throughput capabilities of more than 4,000 gene expression measurements/24 hours.
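The size-spacing rule can be illustrated with a short sketch that lists product lengths in a 150 to 850 base pair span, each at least 10% longer than the previous one. The function name and the strict 10% step are assumptions; the sketch gives a theoretical upper bound, and a practical design, such as the roughly 15 sizes given above, would leave extra margin.

```python
def size_ladder(min_bp: int, max_bp: int, min_percent_step: int = 10) -> list:
    """List product lengths, each at least min_percent_step percent above the last.

    Integer arithmetic is used so that rounding up is exact.
    """
    sizes = []
    current = min_bp
    while current <= max_bp:
        sizes.append(current)
        # Ceiling of current * (100 + step) / 100 without floating point error
        current = -(-current * (100 + min_percent_step) // 100)
    return sizes


ladder = size_ladder(150, 850)
print(len(ladder), ladder)
```

Under these assumptions the span holds at most 18 distinct sizes, from 150 to 775 base pairs.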

In some embodiments, a higher throughput may be achieved by increasing the number of channels/device, genes/channel and/or the number of electrophoresis devices used. As another specific example, using two Agilent 2100's, and electrophoresing 4 genes/channel, the throughput capacity in eight hours can be about 1,056 (2 Agilent 2100 × 12 channels/chip × 4 genes/channel × 2 chips/hour × 8 hours = 1056). Using four Agilent 2100's, in some embodiments, for example, can double throughput to about 2,112 measurements/eight hours. Other preferred embodiments may have 96 channels instead of 12, further increasing the number of genes that may be measured/run. Other preferred embodiments can triple throughput potential to about 6,336 measurements/eight hours.
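The throughput arithmetic in these examples is a product of the number of devices, channels per chip, genes per channel, chips per hour and hours of operation, and can be checked with a short sketch. The function and parameter names are illustrative only. Note that the factors exactly as listed for the two-device example multiply to 1,536 rather than the stated figure of about 1,056, so one of the factors behind the stated figure may differ from those listed.

```python
def measurements_per_shift(devices: int, channels_per_chip: int,
                           genes_per_channel: int, chips_per_hour: int,
                           hours: int) -> int:
    """Multiply the throughput factors named in the text."""
    return devices * channels_per_chip * genes_per_channel * chips_per_hour * hours


# One Agilent 2100, 12 channels, 1 PCR product per channel, 2 chips/hour, 8 hours
single_device = measurements_per_shift(1, 12, 1, 2, 8)

# Two Agilent 2100's, 12 channels, 4 genes per channel, 2 chips/hour, 8 hours
two_devices = measurements_per_shift(2, 12, 4, 2, 8)

print(single_device, two_devices)
```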

Preferably, the separation device used in the practice of the present business method allows for miniaturization. Automation combined with miniaturization can lead to high throughput and further increase speed, as well as using only small amounts of nucleic acid (e.g., small amounts of cDNA) and/or other reagents.

For example, in some embodiments, throughput capacity for gene expression measurement is increased with the use of microfluidic separation devices. Some highly preferred embodiments can use capillary electrophoresis (CE) devices, more preferably, microfluidic CE devices, such as an AMS 90 SE30 microfluidic device (Caliper/Zymark, Hopkinton, Mass., USA). For example, a highly preferred embodiment in the instant business method involves electrophoretically separating and quantifying end-point PCR products using an AMS 90 SE30 device. In some embodiments, amplified product of at least 2, at least 3, or at least about 4 nucleic acids and corresponding competitive templates can be electrophoresed (separated and quantified) in a given microfluidic channel of an AMS 90 SE30 device. In a specific embodiment, for example, a combined automated system of multiblock thermal cyclers and CE devices allows over about 4,000 gene expression assays/24 hours.

In some embodiments, PCR reaction mixtures can be dispensed into microarrays. In some embodiments, nanoarrays and/or nanofluidic techniques may be used. The use of nanotechnology methods for manipulating small liquid volumes can further decrease PCR reaction volumes, e.g., to about 200 nL, about 150 nL, about 100 nL, about 80 nL, about 50 nL, about 20 nL, about 10 nL, about 5 nL, or about 1 nL. See, e.g., Crawford, E. L., Warner, K. A., Khuder, S. A., et al. (2002) Multiplex standardized RT-PCR for expression analysis of many genes in small samples. Biochem. Biophys. Res. Commun. 293, 509-516.

Step 2708 illustrates determining a ratio of the amplified product of the nucleic acid to that of its competitive template, e.g., as described in detail herein. Where the ratio is not within a desired range, software can be used to instruct a robotic liquid handler on how to set up the next experiment, as also described herein.
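The decision made at this step might be sketched as follows. The acceptance bounds of 0.1 and 10, the function name and the action labels are hypothetical placeholders; the desired range and the instructions sent to the liquid handler are as described elsewhere herein.

```python
def next_action(nt_signal: float, ct_signal: float,
                low: float = 0.1, high: float = 10.0) -> tuple:
    """Return the NT/CT ratio and a suggested next step (bounds are assumed)."""
    if ct_signal <= 0:
        raise ValueError("competitive template signal must be positive")
    ratio = nt_signal / ct_signal
    if ratio < low:
        # Competitive template is in large excess: repeat with less CT
        return ratio, "repeat_with_lower_ct_mix"
    if ratio > high:
        # Native template is in large excess: repeat with more CT
        return ratio, "repeat_with_higher_ct_mix"
    return ratio, "accept"


in_range = next_action(50.0, 100.0)
too_low = next_action(5.0, 100.0)
too_high = next_action(1500.0, 100.0)
```

An accepted ratio proceeds to calculation of a numerical value, while the other two outcomes would be translated into a new Mix and/or cDNA dilution for the next run.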

Step 2709 illustrates calculation of a numerical value, e.g., as described herein, where a ratio within the desired range is obtained. In preferred embodiments, quantification is automated. For example, in some embodiments densitometric measurement of amplified product takes place automatically, e.g., as bands migrate past a laser/photomultiplier unit. For example, in some embodiments, the relative amounts of NT and CT can be determined by densitometric quantification of intercalator dye stained bands, using peak areas. Use of an Agilent 2100 electrophoresis device, e.g., can facilitate automated quantification.

Step 2710 illustrates providing data obtained back to the customer. Data may be communicated by any suitable means. For example, the information can be provided via e-mail and/or hard copy. Other communicative means include, e.g., a CD ROM, floppy disc, paper, or telephonic communication.

Step 2711 illustrates any remaining material being returned to the customer, as may be done in some embodiments. In certain embodiments, customers may be encouraged to provide some annotating information, e.g., upon acceptance of a manuscript containing the data for publication, publication of the data, and/or public disclosure of the data. Identifying information may then be collected at a time later than that of collecting the specimen. In some embodiments, annotated standardized gene expression measurements can be entered into a database, e.g., databases comprising numerical values and/or numerical indices, as also described herein.

Additional details for operating a business providing gene expression services are provided in Example VI.

Some embodiments further comprise a step of charging a fee, e.g., charging a fee for a given nucleic acid measurement. In some embodiments, a fee of less than about U.S. $2 per measurement, less than about U.S. $1 per measurement, or less than about U.S. $0.50 per measurement is charged. In some embodiments, a fee of less than about U.S. $10,000, less than about U.S. $1,000, less than about U.S. $100, less than about U.S. $50, less than about U.S. $20, or less than about U.S. $5 per nucleic acid measurement may be charged. In some embodiments, the fees charged customers may be used, at least in part, towards funding developments of other aspects of the instant invention, e.g., funding the determination of new biomarkers, numerical values and/or indices indicating a biological state.

In preferred embodiments, the business method further comprises quality control features. For example, any embodiments of the methods and/or compositions described herein can be used to accomplish quality control. For example, use of some embodiments of the database and the use of internal standards can simplify quality control, including quality control sought by regulatory agencies, such as the FDA (FDA guidance paper on acceptable use of multigene expression measurement in drug development http://www.fda.gov/cdrh/oivd/guidance/1210.pdf) and/or Centers for Disease Control (CDC, Atlanta, Ga., USA), e.g., the Clinical Laboratory Improvement Amendments (CLIA) standards. In some embodiments, the business method provides measurements with one or more of the sensitivities, one or more of the accuracies, one or more of the detection limits, and/or with one or more of the coefficients of variation taught herein. In preferred embodiments, the coefficient of variation is less than about 15% for all or nearly all genes, and less than about 10% for most genes whose expression is measured. Samples may be measured at different times, e.g., on different days; in the same or different experiments in the same laboratory; and/or in different experiments in different laboratories, e.g., allowing comparisons in bivariate and/or multivariate analyses.
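As an illustration of the coefficient of variation criterion, the sketch below computes the CV of replicate measurements of one gene and checks it against a limit. The function names are assumptions, the replicate values are invented for illustration only, and the use of the sample standard deviation is a choice made for this sketch.

```python
from statistics import mean, stdev


def coefficient_of_variation(values: list) -> float:
    """Return the CV in percent: sample standard deviation divided by the mean."""
    if len(values) < 2:
        raise ValueError("at least two replicate measurements are needed")
    return 100.0 * stdev(values) / mean(values)


def passes_cv_limit(values: list, limit_percent: float = 15.0) -> bool:
    """True if the replicates meet the CV limit (15% taken from the text)."""
    return coefficient_of_variation(values) < limit_percent


# Invented replicate values, e.g., molecules per 10^6 reference molecules
replicates = [90.0, 100.0, 110.0]
cv = coefficient_of_variation(replicates)
```

The same check could be run with a 10% limit for genes expected to meet the tighter criterion.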

As a specific embodiment of the business method described herein, a Standardized Expression Measurement (SEM) Center has been established at the Medical College of Ohio (Toledo, Ohio, USA). The SEM Center uses robotic systems to conduct high-throughput gene expression measurements, in accordance with some of the methods described herein, and is available for use at www.geneexpressinc.com.

B. Business Method for R&D Improvement

Some embodiments of the present invention provide a business method of improving drug development. For example, a standardized mixture of internal standards, a database of numerical values and/or a database of numerical indices may be used to improve drug development.

FIG. 28 illustrates the overall process of some embodiments of a business method for improving drug development. Feature 2801 illustrates various stages of drug development. For example, stages can include drug target screening, lead identification, pre-clinical evaluation (e.g., bioassay and/or animal studies), clinical trials and patient treatment.

In some embodiments, modulation of gene expression is measured and/or enumerated at one or more of these stages, e.g., to determine the effect of a candidate drug. For example, a candidate drug (e.g., identified at a given stage) can be administered to a biological entity. The biological entity can be any entity capable of harboring a nucleic acid, as described above, and can be selected appropriately based on the stage of drug development. For example, at the lead identification stage, the biological entity may be an in vitro culture. At the stage of a clinical trial, the biological entity can be a human patient.

The effect of the candidate drug on gene expression may then be evaluated, e.g., using various embodiments of the instant invention. For example, a nucleic acid sample may be collected from the biological entity and amounts of nucleic acids of interest can be measured and/or enumerated. Preferably, methods are used that allow direct comparison of the amount of nucleic acid in the sample to other nucleic acid measurements, e.g., as described herein. For example, amounts can be provided as numerical values and/or numerical indices.

An amount then may be compared to another amount of that nucleic acid at a different stage of drug development, and/or to numerical values and/or indices in a database. This comparison can provide information for altering the drug development process in one or more ways.

Altering a step of drug development may refer to making one or more changes in the process of developing a drug, preferably so as to reduce the time and/or expense for drug development. For example, altering may comprise stratifying a clinical trial. Stratification of a clinical trial can refer to, e.g., segmenting a patient population within a clinical trial and/or determining whether or not a particular individual may enter into the clinical trial and/or continue to a subsequent phase of the clinical trial. For example, patients may be segmented based on one or more features of their genetic makeup determined using various embodiments of the instant invention.

For example, consider a numerical value obtained at a pre-clinical stage, e.g., from an in vitro culture, that is found to correspond to a lack of a response to a candidate drug. At the clinical trial stage, subjects showing the same or similar numerical value can be exempted from participation in the trial. The drug development process has accordingly been altered, saving time and costs.
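The exemption logic in this example can be sketched in a few lines. What counts as a "similar" numerical value is not defined here, so the 20% relative tolerance, the function name and the subject values below are assumptions made purely for illustration.

```python
def stratify(subject_values: dict, nonresponder_value: float,
             rel_tolerance: float = 0.2) -> tuple:
    """Split subjects into (enrolled, exempted) by similarity to a non-responder value.

    A subject is exempted when the value lies within rel_tolerance of the
    pre-clinical non-responder value (tolerance is a hypothetical choice).
    """
    enrolled, exempted = [], []
    for subject_id, value in subject_values.items():
        if abs(value - nonresponder_value) <= rel_tolerance * nonresponder_value:
            exempted.append(subject_id)
        else:
            enrolled.append(subject_id)
    return enrolled, exempted


# Invented numerical values for three trial candidates
candidates = {"S1": 950.0, "S2": 3000.0, "S3": 1150.0}
enrolled, exempted = stratify(candidates, nonresponder_value=1000.0)
```

Because the numerical values are directly comparable between samples, the pre-clinical value and the clinical values can be compared without rescaling.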

Feature 2802 illustrates the development of databases of numerical data from one or more biological entities at various stages of drug development. Using methods of the instant invention, e.g., a common standardized mixture of competitive templates, data from various specimens, evaluated in different laboratories and/or at different times, can be entered in a database and compared directly.

Feature 2803, for example, illustrates a discovery database, e.g., a database comprising gene expression measurements made during bioassays. Feature 2804 illustrates a translational database, e.g., a database comprising gene expression measurements made during animal studies. Feature 2805 illustrates a clinical database, e.g., a database made during clinical trials. Such databases can facilitate communication between various research groups and/or departments and co-ordination of their efforts, increasing synergy along the various steps of drug development.

Some embodiments of the present invention provide a business method of improving drug development using a database comprising numerical values and/or numerical indices, e.g., as described herein.

In some embodiments, a business method is provided that comprises providing a database of numerical values corresponding to measured amounts of a nucleic acid in a number of samples where the numerical values are directly comparable between the number of samples; collecting a specimen of the nucleic acid from a biological entity administered a candidate drug at a stage of drug development; measuring an amount of the nucleic acid in a first sample of the specimen; directly comparing the measured amount to at least one of the numerical values in the database; and altering a step of drug development based on the comparison.

In some embodiments, a business method is provided that comprises providing a database of numerical indices obtained by mathematical computation of 2 numerical values corresponding to measured amounts of 2 nucleic acids in a number of samples where the numerical indices are directly comparable between the number of samples; collecting a specimen of the 2 nucleic acids from a biological entity administered a candidate drug at a stage of drug development; measuring amounts of each of the 2 nucleic acids in a first sample of the specimen; using the 2 measured amounts to mathematically compute a first numerical index; directly comparing the first numerical index to at least one of the numerical indices in the database; and altering a step of drug development based on the comparison.

Such databases can further improve the process of drug development, e.g., by facilitating comparison of a numerical index and/or value with different biological states and altering a step of drug development accordingly. For example, a numerical index and/or numerical value obtained from potential subjects can be used to segment a population and/or to determine whether a given patient is allowed to enter a trial or a subsequent phase. Such numerical values and/or indices may indicate a biological state, e.g., the biological state identifying subjects having a reduced side effect to a given drug.

EXAMPLES

Example I

The following example compares a non-“two-step” approach with a “two-step” approach, in accordance with some embodiments of the instant invention.

Reagents

10× PCR buffer for the Rapidcycler (500 mM Tris, pH 8.3, 2.5 mg/μl BSA, 30 mM MgCl₂) was obtained from Idaho Technology, Inc. (Idaho Falls, Id.). Thermo 10× buffer (500 mM KCl, 100 mM Tris-HCl, pH 9.0, 1.0% Triton X-100), taq polymerase (5 U/μl), oligo dT primers, RNasin (25 U/μl), pGEM size marker, and dNTPs were obtained from Promega (Madison, Wis.). M-MLV reverse transcriptase (200 U/μl) and 5× first strand buffer (250 mM Tris-HCl, pH 8.3, 375 mM KCl, 15 mM MgCl₂, 50 mM DTT) were obtained from GibcoBRL (Gaithersburg, Md.). NuSieve and SeaKem LE agarose were obtained from FMC BioProducts (Rockland, Me.). TriReagent was obtained from Molecular Research Center (Cincinnati, Ohio). RNase-free water was obtained from Research Genetics (Huntsville, Ala.). DNA 7500 Assay kit containing dye, matrix and standards was obtained from Agilent Technologies (Palo Alto, Calif.). The lung adenocarcinoma cell line, A549, was purchased from American Type Culture Collection (Rockville, Md.). RPMI-1640 cell culture medium was obtained from Sigma (St. Louis, Mo.). Universal Human Reference RNA was obtained from Stratagene (La Jolla, Calif.). Oligonucleotide primers were custom synthesized by Biosource International (Menlo Park, Calif.). G.E.N.E. system 1 and system 1a gene expression kits were kindly provided by Gene Express National Enterprises, Inc. (Huntsville, Ala.). All other chemicals and reagents were molecular biology grade.

RNA Extraction and Reverse Transcription

Total RNA from cells grown in monolayer was extracted according to the TriReagent manufacturer's protocol. Universal Human Reference RNA was precipitated according to the manufacturer's protocol. Approximately 1 μg total RNA was reverse transcribed using M-MLV reverse transcriptase and an oligo dT primer.

Non-Two-Step Approach

Gene expression measurements were performed using previously published (non-two-step) methods (see, e.g., Willey, J. C. et al., Am. J. Respir. Cell Mol. Biol. 19: 6-17, 1998; Gene Express System 1 Instruction Manual, Gene Express National Enterprises, Inc. www.genexnat.com 2000) with G.E.N.E. system 1 or system 1a gene expression kit (Gene Express National Enterprises, Inc.). Briefly, a master mixture containing buffer, MgCl₂, dNTPs, cDNA, competitive template (CT) mixture from G.E.N.E. system 1 or system 1a kit and taq polymerase was prepared and aliquotted into tubes, e.g., a 384-well microplate, containing gene-specific primers and cycled either in a Rapidcycler (Idaho Technology, Inc.) or Primus HT Multiblock thermal cycler (MWG-BIOTECH, Inc., High Point, N.C.) or a PTC-100 block thermocycler with heated lid for 35 cycles.

In each protocol of this example, the denaturation temperature was 94° C., the annealing temperature was 58° C., and the elongation temperature was 72° C. For the Rapidcycler, the denaturation time was 5 seconds, the annealing time was 10 seconds, the elongation time was 15 seconds and the slope was 9.9. For the Primus HT Multiblock, the denaturation, annealing and elongation times were each 1 minute, the lid temperature was 110° C. and the lid pressure was 150 Newtons. PCR products were evaluated on an agarose gel or in the Agilent 2100 Bioanalyzer (Agilent Technologies, Inc.) as described below.

Two-Step Approach

In this example, gene expression measurements were obtained for 9 genes. PCR reactions were amplified in two rounds. In the first round, one reaction was set up containing buffer, MgCl₂, dNTPs, a previously prepared mixture of cDNA and competitive template mixture (1:1 cDNA from A549 p85 and one of the competitive template mixes from G.E.N.E. system 1a mix D), taq polymerase and primer pairs for the 9 genes. This reaction was cycled for 5, 8, 10 or 35 cycles. Mix D from G.E.N.E. system 1 contained 10⁻¹² M β-actin CT and 10⁻¹⁵ M of CTs for the other genes. The concentration of each primer in the primer mix was 0.05 μg/μl. Following this amplification, this PCR product was diluted with water for use as a template in round two.

In round two, a standardized mixture containing buffer, MgCl₂, taq polymerase and a primer pair specific for a given gene was aliquotted into tubes containing 1 μl of each of the following dilutions of PCR product from the first round: undiluted, 1/5, 1/10, 1/50, 1/100, 1/1,000, 1/10,000, 1/100,000 and 1/1,000,000. These reactions were cycled 35 times and detected on an agarose gel or in the Agilent 2100 Bioanalyzer as described below. Primer pairs used in this round were selected from among the primer pairs used in round one. No additional cDNA or competitive template mixture was added into the PCR reaction in round two, in this example.

For control non-two-step reactions, the mixture of cDNA and competitive template mixture prepared for use in round one of the nine genes was serially diluted prior to amplification: undiluted, 1/5, 1/10, 1/50, 1/100, 1/1,000, and 1/10,000. A 1 μl aliquot of each of these dilutions was combined with a 9 μl aliquot of a standardized mixture containing buffer, MgCl₂, Taq polymerase and a primer pair specific for a given gene (0.05 μg/μl of each primer). These reactions were amplified with only one round of 35 cycles.

Gene expression measurements were also obtained for another 96 genes using a two-step approach of some embodiments. Samples of cDNA derived from Stratagene Universal Human Reference RNA and competitive template mixes from G.E.N.E. system 1 (which contain CTs for 96 genes) were used in these experiments. A solution containing primers for each of the 96 genes represented by CTs in G.E.N.E. system 1 was included in the first round reactions. This 96 gene primer mix was diluted so that the concentration of each primer was 0.005 μg/μl. Every round one reaction was cycled 35 times. Round one PCR products then were diluted 100-fold (1 μl of round one product was diluted into 99 μl water). One microliter of diluted round one PCR product was used in each round two reaction along with primers for a given gene selected from among those amplified in round one, and cycled 35 times.

Control non-two-step reactions were conducted using samples of cDNA derived from Stratagene Universal Human Reference RNA and competitive template mixes from G.E.N.E. system 1 as described above. For these experiments, no dilution of the cDNA or competitive template mix was done prior to amplification.

Electrophoresis and Quantitation

Agarose Gel Electrophoresis:

Following amplification, PCR products were loaded directly onto 4% agarose gels (3:1 NuSieve:SeaKem) containing 0.5 μg/ml ethidium bromide. Gels were electrophoresed for approximately one hour at 225 V. Electrophoresis buffer was cooled and recirculated during electrophoresis. Gels were visualized with a Foto/Eclipse image analysis system (Fotodyne, Hartland, Wis.). Digital images were saved on a Power Mac 7100/66 computer and Collage software (Fotodyne) was employed for densitometric analysis; alternatively, PCR products were analyzed using the Agilent 2100 Bioanalyzer, as discussed below.

Gene expression was then quantified. First, the native template/competitive template (NT/CT) ratio of a reference gene, β-actin, was calculated, as were the NT/CT ratios for each of the genes to be measured. Because the initial concentration of competitive template added into the PCR reaction was known, the initial NT concentration could be determined. Since each NT/CT ratio was based on intercalating dye (ethidium bromide) staining of the PCR products, and staining intensity is affected by both the number of molecules present and the length of the molecules in base pairs, NTs were arbitrarily corrected to the size of the competitive template product prior to taking the NT/CT ratio. Heterodimers (HD), when measurable, were corrected to the size of the competitive template and divided by two. One half of the HD value was added to the NT and one half was added to the competitive template prior to taking the NT/CT ratio, since one strand of the HD comes from the NT and the other comes from the CT. Second, the calculated number of NT molecules for a given gene was divided by the calculated number of β-actin NT molecules to correct for loading differences.
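The size and heterodimer corrections described above can be sketched as follows. This is an illustrative sketch only: the function name `nt_ct_ratio`, its parameter names, and the requirement that the caller supply a heterodimer length (`hd_bp`) are assumptions, since the text does not state which length is used when correcting the heterodimer signal.

```python
def nt_ct_ratio(nt_signal, ct_signal, nt_bp, ct_bp, hd_signal=0.0, hd_bp=None):
    """Size-corrected NT/CT ratio for one gene (hypothetical helper).

    Signals are densitometric band intensities; *_bp are product lengths.
    """
    # Correct the NT signal to the length of the CT product.
    nt_corrected = nt_signal * ct_bp / nt_bp

    half_hd = 0.0
    if hd_signal:
        if hd_bp is None:
            # Assumption: the caller must say what length the HD band represents.
            raise ValueError("hd_bp is required when a heterodimer signal is given")
        # Correct the heterodimer to CT length, then split it between NT and CT,
        # because one strand comes from each template.
        half_hd = (hd_signal * ct_bp / hd_bp) / 2

    return (nt_corrected + half_hd) / (ct_signal + half_hd)


# Example with made-up intensities: NT 200 bp, CT 100 bp, HD band treated as 200 bp.
print(nt_ct_ratio(10, 10, 200, 100, hd_signal=4, hd_bp=200))
```

Multiplying the returned ratio by the known number of CT molecules added to the reaction gives the estimated starting number of NT molecules.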

For embodiments using the two-step approach, genes detected under each condition (varying dilution and/or round one cycle number) were measured against β-actin detected under the same condition. For example, round one of a two-step process contained primers for nine genes including β-actin and c-myc that can be used as reference nucleic acids. A 1/100,000 dilution of the PCR reaction from round one was made and used in round two. An aliquot of this dilution was used in round two to amplify both β-actin and c-myc. Under these conditions, c-myc was measured as 3.40×10⁴ molecules/10⁶ β-actin molecules when cycled 35 times in round one and 35 times in round two.

FIG. 29 illustrates the results for these experiments. Briefly, PCR reactions were amplified in the Rapidcycler. In round one, a 10 μl reaction mixture was prepared containing buffer, MgCl₂, dNTPs, a previously prepared mixture of cDNA and competitive template mixture (1:1 cDNA from A549 p85 and G.E.N.E. system 1 mix D), Taq polymerase and 1 μl of a 10× stock solution of 9 primer pairs (concentration of 0.05 μg/μl). This reaction was cycled 5, 8, 10 or 35 times. Following round one amplification, the PCR products were diluted for use as templates in round two. In round two, 10 μl PCR reactions were prepared by placing 9 μl of a master mixture containing buffer, MgCl₂, Taq polymerase and a primer pair specific for one gene into tubes containing 1 μl of each of the following dilutions of PCR product from round one: undiluted, 1/5, 1/10, 1/50, 1/100, 1/1,000, 1/10,000, 1/100,000, and 1/1,000,000. These reactions were cycled 35 times. Primer pairs used in round two were selected from among the primer pairs used in round one. No additional cDNA or competitive template mixture was added into the PCR reaction in round two. For non-two-step reactions, the mixture of cDNA and competitive template mixtures prepared for use in round one was serially diluted prior to amplification: undiluted, 1/5, 1/10, 1/50, 1/100, 1/1,000, and 1/10,000. A 1 μl aliquot of each dilution was combined with an aliquot of a mixture containing buffer, MgCl₂, Taq polymerase and a primer pair specific for one gene (0.05 μg/μl of each primer). These reactions were amplified in only one round of 35 cycles. Gene expression was then quantified.

Agilent 2100 Bioanalyzer Microcapillary Electrophoresis:

Following amplification, 1 μl of each 10 μl PCR reaction was loaded into a well of a chip prepared according to the manufacturer's protocol for the DNA 7500 Assay. Briefly, 9 μl gel-dye matrix was loaded into the chip in one well and the chips were pressurized for 30 seconds. Two additional wells were filled with gel-dye matrix and the remaining wells each were loaded with 5 μl of molecular weight marker. One microliter of DNA ladder was loaded into a ladder well and 1 μl of PCR product was loaded into each sample well. The chip was vortexed and placed into the Agilent 2100 Bioanalyzer. The DNA 7500 Assay program, which was then run, applies a current sequentially to each sample to separate products. DNA was detected by fluorescence of the intercalating dye in the gel-dye matrix. NT/CT ratios were calculated from the area under the curve for each PCR product and a size correction was made since, as with ethidium bromide stained agarose gel electrophoresis, an intercalating dye was used to detect DNA.

All statistical analyses were conducted using SPSS version 9.0 for Windows. A two-tailed Pearson correlation test was conducted on logarithmically transformed data to compare gene expression values obtained using a non-two-step approach with those obtained using a two-step approach, in accordance with some embodiments of the instant invention. The correlation was considered statistically significant if the p value was less than 0.05.
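As an illustration of the comparison described above, the following sketch computes a Pearson correlation coefficient on log-transformed expression values. It is not the SPSS procedure used in this example: the function names are hypothetical, the base-10 logarithm is an assumption (the text says only "logarithmically transformed"), the input values are made up, and the two-tailed p value is not computed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log_pearson(non_two_step, two_step):
    """Correlate paired expression values after log10 transformation."""
    log_a = [math.log10(v) for v in non_two_step]
    log_b = [math.log10(v) for v in two_step]
    return pearson_r(log_a, log_b)


# Hypothetical paired values (molecules/10^6 beta-actin molecules) for four genes.
uniplex = [120.0, 3400.0, 56000.0, 910.0]
multiplex = [150.0, 2900.0, 61000.0, 870.0]
print(round(log_pearson(uniplex, multiplex), 3))
```

Log transformation keeps genes expressed at very high levels from dominating the coefficient when values span several orders of magnitude.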

Results: Non-Two Step approach amplifying Nine Genes

FIG. 30 illustrates the results of experiments comparing non-two step (30A and 30C) with two-step approaches (30B and 30D), according to some embodiments of the instant invention. FIGS. 30A-D illustrate representative results of using a two-step vs. a non-two step process. FIG. 30A illustrates that, in a non-two step reaction using β-actin primers, a dilution of the un-amplified PCR reaction mixture from round one by more than 100, followed by one 35 cycle round of PCR-amplification with one primer pair, did not yield any detectable product. Lanes are as follows: lane 1, pGEM size marker; lane 2, PCR reaction containing undiluted cDNA/CT mix, equivalent to 300,000 initial molecules of β-actin CT; lane 3, PCR reaction containing 1:5 diluted cDNA/CT mix, 60,000 molecules; lane 4, 1:10 diluted cDNA/CT mix, 30,000 molecules; lane 5, 1:50 diluted cDNA/CT mix, 6,000 molecules; lane 6, 1:100 diluted cDNA/CT mix, 3,000 molecules; lane 7, 1:1,000 diluted cDNA/CT mix, 300 molecules; lane 8, 1:10,000 diluted cDNA/CT mix, 30 molecules.

FIG. 30B illustrates PCR products obtained from a two-step approach, using an aliquot of round one PCR product and β-actin primers. Lanes are as follows: lane 1, pGEM size marker; lane 2, 1/500th of the round one 10 μl PCR product (1 μl of a 1:50 dilution), equivalent to 600 initial molecules of β-actin CT; lane 3, 1/1,000th round one PCR product, 300 molecules; lane 4, 1/10,000th round one PCR product, 30 molecules; lane 5, pGEM size marker; lane 6, 1/10,000th round one PCR product, 30 molecules; lane 7, 1/100,000th round one PCR product, 3 molecules; lane 8, 1/1,000,000th round one PCR product, 0.3 molecules; lane 9, 1/10,000,000th round one PCR product, 0.03 molecules.

FIG. 30C illustrates a non-two step reaction using catalase primers, in which diluting the un-amplified PCR reaction mixture from round one by more than 1,000, followed by one 35 cycle round of PCR-amplification with one primer pair, did not yield any detectable product, similar to the result for β-actin (FIG. 30A). Lanes are as follows: lane 1, pGEM size marker; lane 2, PCR reaction containing undiluted cDNA and competitive template mix, equivalent to 3,000 molecules of catalase CT; lane 3, 1:5 diluted cDNA/CT mix, 600 molecules; lane 4, 1:10 diluted cDNA/CT mix, 300 molecules; lane 5, 1:50 diluted cDNA/CT mix, 60 molecules; lane 6, 1:100 diluted cDNA/CT mix, 30 molecules; lane 7, 1:1,000 diluted cDNA/CT mix, 3 molecules; lane 8, 1:10,000 diluted cDNA/CT mix, 0.3 molecules.

FIG. 30D illustrates PCR products obtained from a two-step approach, using an aliquot of round one PCR product and catalase primers for the second round. Lanes are as follows: lane 1, pGEM size marker; lane 2, 1/100th of the 10 μl round one PCR product (1 μl of a 1:10 dilution), equivalent to 30 molecules of catalase CT; lane 3, 1/500th round one PCR product, 6 molecules; lane 4, 1/1,000th round one PCR product, 3 molecules; lane 5, 1/10,000th round one PCR product, 0.3 molecules; lane 6, 1/100,000th round one PCR product, 0.03 molecules; lane 7, 1/1,000,000th round one PCR product, 0.003 molecules; lane 8, 1/10,000,000th round one PCR product, 0.0003 molecules.
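The molecule equivalents quoted for each lane follow from multiplying the starting CT copy number by the dilution fraction. A minimal sketch of that arithmetic, assuming the starting values of 300,000 β-actin CT molecules and 3,000 catalase CT molecules stated for the undiluted lanes; the function name `ct_equivalents` is hypothetical:

```python
from fractions import Fraction


def ct_equivalents(initial_molecules, dilution_factors):
    """Initial CT molecules represented by each diluted aliquot.

    A dilution factor of 50 means a 1:50 dilution (1/50th of the material).
    Fractions are used so that sub-molecule equivalents stay exact.
    """
    return [initial_molecules * Fraction(1, d) for d in dilution_factors]


# Non-two-step series for beta-actin: undiluted through 1:10,000.
series = [1, 5, 10, 50, 100, 1_000, 10_000]
for factor, molecules in zip(series, ct_equivalents(300_000, series)):
    print(f"1:{factor:<6} {float(molecules):g} molecules")
```

Equivalents below one molecule, as in the highest two-step dilutions, can still give product only because the templates were already amplified in round one.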

Two-Step Approach Amplifying Nine Genes

After 35 cycles of amplification in round one with primer pairs for nine genes, aliquots of the PCR products were diluted and amplified with primers for one of the nine genes. Bright, distinct bands were observed for each gene. Thus, the same amount of cDNA and competitive template mix that is used in a typical non-two-step reaction to measure one gene in one round of amplification was used to obtain nine gene expression measurements in two-step approaches.

Further, the round one PCR product can be diluted more than 100,000-fold for c-myc, 1,000,000-fold for β-actin, or 10,000,000-fold for catalase and still be quantified following amplification with primer pairs for one gene in round two (FIG. 29). In contrast, when the cDNA and competitive template mix used in round one was diluted more than 1,000-fold prior to amplification (or more than 100-fold for β-actin) and then amplified with a single primer pair for any one of these genes in a single round of 35 cycles, no detectable product was observed. The amount of amplified product that could be diluted prior to round two and still yield detectable product after round two was directly related to the number of round one cycles.

Increasing the number of cycles used in round one increased the extent to which the PCR product could be diluted prior to round two and still be detectable after round two amplification. Therefore, more gene expression measurements can be made using a sample when it is amplified in the two-step approach (e.g., with 35 cycles used in each round) than when fewer cycles (e.g., 5, 8 and 10 cycles) are used in round one, or when no second round is used. Details for each gene and each condition are shown in FIG. 29. Representative gels of control non-two step reactions and two-step reactions are shown in FIG. 30.

Two-Step Approach Amplifying 96 Genes

Gene expression values obtained by non-two-step and 96 gene two-step reactions using cDNA derived from Stratagene Universal Human Reference RNA are shown in FIG. 4. Although 96 primer pairs were included in the two-step reactions, gene expression values for only 93 genes are reported because 1) each gene expression value is reported as molecules of a given gene/10⁶ molecules of β-actin, so β-actin values are not reported, 2) although two sets of reagents to measure GAPD gene expression (GAPD CT1 and GAPD CT2) are included in the G.E.N.E. system 1 kit, only GAPD CT1 was measured in this sample, and 3) reagents for one gene, BAX alpha, provided in the kit did not pass quality control testing done by G.E.N.E., Inc., so this gene was not assessed in this study. Bivariate analysis of gene expression values using the two approaches revealed a highly significant (p=0.001) positive correlation (r=0.993). The two approaches were reproducible, showing no significant differences in measurements for more than 90% of genes assayed (FIG. 29).

FIG. 31 is a graph showing the correlation of gene expression values obtained by either 96 gene two-step or non-two-step approaches. Samples of cDNA derived from Stratagene Universal Human Reference RNA were combined with CT mix (mixes B, C, D, E and F from G.E.N.E. system 1 were used) and amplified either by uniplex StaRT-PCR or by 96 gene multiplex StaRT-PCR with primer pairs for all genes in G.E.N.E. system 1. Mean values are presented in FIG. 4 for the 93 genes that could be evaluated. Of these, 79 were measured by both non-two-step and two-step approaches and could be compared. Gene expression values are presented as molecules of mRNA per 10⁶ β-actin mRNA molecules. Values obtained by non-two step methods are plotted along the X axis and values obtained by two-step methods are plotted along the Y axis.

Two-Step Approach Measurements on Small Samples

FIGS. 1 and 2 indicate gene expression data obtained from small amounts of materials. FIG. 1 shows data collected from a fine needle aspiration biopsy of non-small cell lung cancer (NSCLC) primary tissue cells. All data were measured using the competitive template mixtures from the GENE System 1 by 18 multiplex PCR.

FIG. 2 shows data collected from primary tissue cells from a lung donor who had no disease of the lung. The gene expression data were also collected using 96 gene multiplex PCR with the competitive template mixes from the GENE System 1.

Example II

The following example provides additional details of an overall process of evaluating gene expression measurements according to some embodiments of the instant invention.

Materials

1. Standardized RT-PCR reagents, including primers and standardized mixtures, are purchased from Gene Express, Inc. (GEI, Toledo, Ohio).

2. Buffer for Idaho Rapidcycler air thermocycler: 500 mM Tris-HCl, pH 8.3, 2.5 μg/μL BSA, 30 mM MgCl₂ (Idaho Technology, Inc., Idaho Falls, Id.).

3. Buffer for block thermocyclers, Thermo 10 X, 500 mM KCl, 100 mM Tris-HCl, pH 9.0, 1.0% Triton X-100 (Promega, Madison, Wis.).

4. Taq polymerase (5 U/μL), Moloney Murine Leukemia Virus (MMLV) reverse transcriptase, MMLV RT 5× first strand buffer: 250 mM Tris-HCl, pH 8.3, 375 mM KCl, 15 mM MgCl₂, 50 mM dithiothreitol, oligo dT primers, RNasin, pGEM size marker, and deoxynucleotide triphosphates (dNTPs) also are obtained from Promega.

5. TriReagent is obtained from Molecular Research Center, Inc. (Cincinnati, Ohio).

6. Ribonuclease (RNase)-free water and TOPO TA cloning kits are obtained from Invitrogen (Carlsbad, Calif.). The quality of the RNase-free water can be important for the efficient extraction of intact RNA. For example, inadequate DEPC treatment and/or inadequate removal of DEPC after treatment can inhibit reverse transcription and PCR.

7. GigaPrep plasmid preparation kits are purchased from Qiagen (Texas).

8. Caliper AMS 90SE chips are obtained from Caliper Technologies, Inc. (Mountain View, Calif.).

9. DNA purification columns are obtained from QiaQuick (Qiagen, Valencia, Calif.).

RNA Extraction and Reverse Transcription

RNA Extraction: Cell suspensions can be pelleted, the supernatant poured off, and the pellet dissolved in TriReagent and extracted according to manufacturer's instructions and previously described methods. See, e.g., Bustin, S. A. (2000) Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays. J. Mol. Endocrinol. 25, 169-193. The RNA pellet can be stored under ethanol at −80° C., or suspended in RNase-free water and frozen at −80° C. It may be stored in this condition for years. The quality of the RNA can be evaluated on an Agilent 2100 using the RNA chip, according to manufacturer's instructions.

Reverse Transcription: 1 μg total RNA can be reverse transcribed using MMLV RT and an oligo dT primer as previously described. See, e.g., Willey, J. C., Coy, E. L., Frampton, M. W., et al. (1997) Quantitative RT-PCR measurement of cytochromes p450 1A1, 1B1, and 2B7, microsomal epoxide hydrolase, and NADPH oxidoreductase expression in lung cells of smokers and non-smokers. Am. J. Respir. Cell Mol. Biol. 17, 114-124. For small amounts of RNA (e.g., less than about 100 ng), the efficiency of reverse transcription may be improved by using Sensiscript™ rather than MMLV reverse transcriptase, e.g., efficient reverse transcription may be obtained with about 50 ng of RNA using Sensiscript™. The reaction can be incubated at 37° C. for 1 h.

Synthesis and Cloning of Competitive Templates

Internal standard competitive templates (CTs) can be constructed based on previously described methods. See, e.g., Willey, J. C., Crawford, E. L., and Jackson, C. M. (1998) Expression measurement of many genes simultaneously by quantitative RT-PCR using standardized mixtures of competitive templates. Am. J. Respir. Cell Mol. Biol. 19, 6-17; Crawford, E. L., Peters, G. J., Noordhuis, P., et al. (2001) Reproducible gene expression measurement among multiple laboratories obtained in a blinded study using standardized RT (StaRT)-PCR. Mol. Diagn. 6, 217-225; and/or Celi, F. S., Zenilman, M. E., and Shuldiner, A. R. (1993) A rapid and versatile method to synthesize internal standards for competitive PCR. Nucleic Acids Res. 21, 1047.

Native Template Primer Design

Before a CT for a gene is constructed, a primer pair can be designed that amplifies (preferably, efficiently amplifies) native cDNA corresponding to the expressed gene. For example, primers can be designed with one or more of the following characteristics: (1) an ability to amplify from about 200 to about 850 bases of the coding region of genes of interest; (2) an annealing temperature of about 58° C. (tolerance of ±1° C.). In some embodiments, Primer 3.1 software (Steve Rozen, Helen J. Skaletsky, 1996, 1997) can be used to design the primers (code available at http://www-genome.wi.net.edu/genome_software/other/primer3.html). Primers were initially designed using Primer 3.1 software to amplify from about 200 to about 800 bases of the coding region of targeted genes with an annealing temperature of about 58° C. (tolerance of ±about 1° C.). This allowed the PCR reactions in this example to be run under identical or nearly identical conditions and further allows for automation and high throughput applications, including microfluidic capillary gel electrophoresis. For example, primer sequences and Genbank accession numbers for certain genes are available at www.geneexpressinc.com. Primers can also be designed to amplify from about 20 to about 2,000 bases, in other examples.

Native Template Primer Testing

Designed primers can be synthesized and used to amplify native template of cDNA corresponding to the gene(s) of interest. The presence of a single strong band after 35 cycles of PCR can verify that the primers are sufficiently efficient and/or specific for some embodiments. For example, primers can be tested using reverse transcribed RNA from a variety of tissues and/or cDNA clones known to represent the gene(s) of interest. In some embodiments, primer pairs that fail to amplify the target gene in any tissue or individual cDNA clone (which occurs, e.g., less than about 10% of the time) can be redesigned and the process repeated.

Competitive Template Primer Design

A CT primer can be prepared according to previously described methods and/or as illustrated in FIG. 3. FIG. 32 a illustrates that forward (striped bar) and reverse (black bar) primers (approx 20 bp in length) that span a 150-850 bp region can be used to amplify the native template (NT) from cDNA. Taq polymerase can synthesize DNA from these primers (dashed lines) using the NT.

FIG. 32 b illustrates that, after confirming that the native template primers work, a CT primer can be designed as an about 40 bp primer with the sequence for the reverse primer (black bar) at the 5′ end, and a 20 bp sequence homologous to an internal native template sequence (white bar) at the 3′ end, collinear with the reverse primer sequence. The 3′ end of this 40 bp primer can be designed to be homologous to a region about 50 to about 100 bp internal to the reverse primer. The 5′ end of this about 40 bp primer can hybridize to the region homologous to the reverse primer, while the 3′ end can hybridize to the internal sequence. Furthermore, Taq polymerase can synthesize DNA using the primers bound at the 3′ end (dashed line) and not the primer bound at the 5′ end.

FIG. 32 c illustrates that in the next PCR cycle, the DNA newly synthesized using the about 40 bp primer hybridized to the internal sequence can be bound to forward primer (striped bar), and a homologous strand can be synthesized. FIG. 32 d illustrates that this can generate a double stranded CT with the reverse primer sequence about 100 bp closer to the forward primer than occurs naturally in the NT. See, e.g., Chomczynski, P. and Sacchi, N. (1987) Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162, 156-159; Celi, F. S., Zenilman, M. E., and Shuldiner, A. R. (1993) A rapid and versatile method to synthesize internal standards for competitive PCR. Nucleic Acids Res. 21, 1047.
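The primer layout described for FIG. 32 can be sketched in code. This is an illustration under stated assumptions, not the procedure used to build the reagents: the function names, the fixed 20-base primer and internal-segment lengths, the 75-base deletion (`gap`, chosen from within the about 50 to about 100 bp range), and the random test sequence are all hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_ct_primer(nt_amplicon, primer_len=20, internal_len=20, gap=75):
    """Build a fusion CT primer from the sense strand of an NT amplicon.

    Returns (reverse_primer, ct_primer, ct_product). `gap` is the number of
    NT bases between the internal segment and the reverse primer site; these
    bases are absent from the CT product.
    """
    if len(nt_amplicon) < 2 * primer_len + internal_len + gap:
        raise ValueError("amplicon too short for the requested layout")
    # The reverse primer anneals to the last primer_len bases of the sense strand.
    reverse_primer = revcomp(nt_amplicon[-primer_len:])
    # Internal segment: internal_len bases ending `gap` bases upstream of that site.
    internal_end = len(nt_amplicon) - primer_len - gap
    internal = nt_amplicon[internal_end - internal_len:internal_end]
    # Fusion primer: reverse primer at the 5' end, internal complement at the 3' end.
    ct_primer = reverse_primer + revcomp(internal)
    # Sense strand of the resulting CT: NT up to the internal segment, then the
    # reverse primer site, so the CT is shorter than the NT by `gap` bases.
    ct_product = nt_amplicon[:internal_end] + nt_amplicon[-primer_len:]
    return reverse_primer, ct_primer, ct_product


rng = random.Random(0)  # fixed seed so the made-up sequence is reproducible
nt = "".join(rng.choice("ACGT") for _ in range(300))
reverse_primer, ct_primer, ct_product = design_ct_primer(nt)
print(len(ct_primer), len(nt), len(ct_product))  # prints: 40 300 225
```

Because the CT keeps both primer binding sites of the NT, the same forward and reverse primer pair amplifies both templates, while the length difference lets the two products be separated by electrophoresis.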

Competitive Template Primer Testing

The prepared CT may be tested. For example, the CT primer can be paired with the designed forward primer and used to amplify CT from native cDNA. Before each competitive template in this example was constructed, each primer pair in this example was tested using reverse transcribed RNA from a variety of tissues or individual cDNA clones known to represent the gene of interest as a quality control. For primer pairs that failed (about 10% of the time), new ones were designed and the process repeated. For each gene, a competitive template primer (a fusion oligo of about 40 bp) then was prepared. The 3′ end of each fusion primer consisted of an about 20 base sequence homologous to a region about 50 to about 100 bases 3′ to the reverse primer. The 5′ end was the 20 bp reverse primer.

Competitive Template-Internal Standard Production

For each of a number of genes to be assayed, five 10 μL PCR reactions can be set up, using the designed NT forward primer and the CT primer, and amplified for 35 cycles. The products of the five PCR reactions can be combined, electrophoresed on a 3% NuSieve gel in 1× TAE, and the band of correct size cut from the gel and extracted using a QiaQuick method (Qiagen, Valencia, Calif.). The purified PCR products can be cloned into PCR 2.1 vector using TOPO TA cloning kits (Invitrogen, Carlsbad, Calif.) and then transformed into HS996 (a T1-phage resistant variant of DH10B). After cloning and transformation, colonies can be plated on LB plates containing X-Gal, IPTG, and carbenicillin, and 3 isolated white colonies selected. Plasmid minipreps can be prepared, EcoRI digestion performed and the digested products electrophoresed on 3% SeaKem agarose. For those clones showing an insert based on EcoRI digestion, it can be confirmed that the insert is the desired one by sequencing the same undigested plasmid preparation using vector specific primers. The clones with homology to the correct gene sequence and having 100% match for the primer sequences can be used in large-scale CT preparation and can be included in standardized mixtures. For example, those that pass this quality control assessment can be used in the following steps.

Plasmids from each quality-assured clone then were prepared in quantities large enough (about 1.5 L) to allow for about 1 billion assays (approximately 2.6 mg). The plasmids were purified from the resultant harvested cells using the Qiagen GigaPrep kit. Plasmid yields were assessed using a Hoeffer DyNAQuant 210 fluorometer.

In this example, an aliquot of each plasmid preparation was again sequenced as a quality control. For each competitive template that passed the quality control steps outlined in this example, the sensitivity of the cloned CT and primers was assessed by performing PCR reactions on serial dilutions and determining the limiting concentration that still yielded a PCR product. In this example, only those preparations and primers that allow for detection of 60 molecules or less (e.g., a product obtained with 10⁻¹⁶ M CT in 10 μl PCR reaction volume) were allowed to be included in standardized competitive template mixtures. In this example, most of the assays that were developed had a sensitivity of about 6 molecules or less (e.g., more than 80% of the CTs that were developed had a sensitivity of 6 molecules or less, or 10⁻¹⁷ M CT).

Preparation of Standardized Mixtures

Plasmids from quality-assured preparations were mixed into competitive template mixtures representing either 24 or 96 genes. The concentrations of the competitive templates in the 24 gene standardized mixtures were 4×10⁻⁹ M for β-actin CT, 4×10⁻¹⁰ M for GAPD (CT1), 4×10⁻¹¹ M for GAPD (CT2), and 4×10⁻⁸ M for each of the other CTs in this example.

The 24 gene competitive template mixes can be linearized by NotI digestion prior to preparation of a series of serially-diluted standardized mixtures described below. For example, the mixes can be incubated with NotI enzyme at a concentration of 1 unit/μg of plasmid DNA in about 15 mL of buffer at 37° C. for 12-16 hours. Four linearized 24-gene competitive template mixes were combined in equal amounts to yield 96-gene competitive template mixes having concentrations of 10⁻⁹ M for β-actin, 10⁻¹⁰ M GAPD (CT1), 10⁻¹¹ M GAPD (CT2), and 10⁻⁸ M for the other CTs. These mixes then can be serially diluted with a reference gene CT mix, e.g., comprising the 10⁻⁹ M β-actin, 10⁻¹⁰ M GAPD (CT1), 10⁻¹¹ M GAPD (CT2) mix, yielding a stock series at concentrations of 10⁻⁹ M for β-actin, 10⁻¹⁰ M for GAPD CT1, 10⁻¹¹ M for GAPD CT2, and 10⁻⁸, 10⁻⁹, 10⁻¹⁰, 10⁻¹¹, 10⁻¹², and 10⁻¹³ M for the other CTs used in this example.

These stock concentrations can be diluted 1,000-fold to provide working dilutions, e.g., to yield a series of six serially-diluted standardized mixtures (A-F) at concentrations of 10⁻¹² M for β-actin, 10⁻¹³ M for GAPD CT1, 10⁻¹⁴ M for GAPD CT2, and 10⁻¹¹ (A), 10⁻¹² (B), 10⁻¹³ (C), 10⁻¹⁴ (D), 10⁻¹⁵ (E), and 10⁻¹⁶ M (F) for the other CTs used in this example.
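To relate these molar concentrations to molecule counts, the following sketch converts a working concentration into molecules per microliter using Avogadro's number. The function and dictionary names are hypothetical; the concentrations are those listed above for mixtures A-F.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_in_volume(molar, microliters):
    """Number of molecules in a given volume of a solution of given molarity."""
    liters = microliters * 1e-6
    return molar * liters * AVOGADRO


# Working concentrations (M) of the non-reference CTs in mixtures A-F.
OTHER_CT_MOLAR = {"A": 1e-11, "B": 1e-12, "C": 1e-13, "D": 1e-14, "E": 1e-15, "F": 1e-16}

print(f"beta-actin CT, 1 uL at 1e-12 M: {molecules_in_volume(1e-12, 1):.3g} molecules")
for mix, molar in OTHER_CT_MOLAR.items():
    print(f"mix {mix}: {molecules_in_volume(molar, 1):.3g} molecules per uL")
```

On this arithmetic, 1 μL of a 10⁻¹² M solution holds about 6×10⁵ molecules and 1 μL of a 10⁻¹⁶ M solution holds about 60 molecules, which agrees with the 60-molecule sensitivity threshold quoted for 10⁻¹⁶ M CT.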

The following illustrates use of a series of serially-diluted standardized mixtures, in accordance with some embodiments of the instant invention. In this example, “SMIS” refers to a standardized mixture of internal standards, prepared in accordance with embodiments of the instant invention.

A volume of cDNA sample (diluted to a level in balance with the amount of β-actin CT molecules in 1 μL of SMIS (6×10⁵ molecules)) can be combined and mixed with an equal volume of the appropriate SMIS A-F, such that the NT/CT ratio for a nucleic acid being measured will be greater than about 1/10 and less than about 10/1. For example, if among previous samples, a gene has been expressed within a range of 10¹-10³ molecules/10⁶ β-actin molecules, the gene will be measured using SMIS E. In contrast, if among previous samples, a gene has been expressed within a range of 10⁵-10⁷ molecules/10⁶ β-actin molecules, the gene will be measured using SMIS B. If the appropriate SMIS is not known for a particular gene in a sample from a particular type of tissue, expression can be measured using both SMIS C and E. This allows measurement over four orders of magnitude. For the rare samples that express the gene outside of the expected ranges, a follow-up analysis with the appropriate CT mix can be performed. For example, for the few genes expressed at very high or low levels, analysis can be repeated with SMIS A or F.

A 1 μL volume of the cDNA/SMIS mixture can be used for each gene expression assay to be performed and can be combined with other components of the PCR reaction mixture (e.g., buffer, dNTPs, Mg++, Taq polymerase, H₂O). Tubes or wells can be prepared with a primer pair for a single gene to be measured. If products are to be analyzed on a PE 310 device, the primers can be labeled with an appropriate fluor. Aliquots of this PCR reaction mixture can be placed into individual tubes each containing primers for a single gene. Using this approach, the ratio of CT for every gene in the mixture relative to its corresponding NT in the cDNA is fixed simultaneously. When aliquots of this mixture are transferred to PCR reaction vessels, although there may be variations in loading volumes resulting from pipetting, variation is controlled in the NT/CT ratio for any gene relative to the NT/CT ratio for a reference gene. This approach also enables standardized expression measurement.

PCR Amplification

Each reaction mixture can be cycled either in an air thermocycler (e.g., Rapidcycler, Idaho Technology, Inc., Idaho Falls, Id.) or a block thermocycler (e.g., PTC-100 block thermal cycler with heated lid, MJ Research, Inc., Incline Village, Nev.) for 35 cycles. In either thermocycler, the denaturation temperature is 94° C., the annealing temperature is 58° C., and the elongation temperature is 72° C.

Separation and Quantification of NT and CT PCR Products

a. Agarose gel. Following amplification, the entire volume of PCR product (typically 10 μL) can be loaded into wells of 4% agarose gels (3:1 NuSieve:SeaKem) containing 0.5 μg/mL ethidium bromide. Gels can be electrophoresed for approx 1 h at 225 V in continuously chilled buffer, and then visualized and quantified with an image analyzer (products available from Fotodyne, BioRad). Following electrophoresis, the relative amount of NT and CT can be determined by densitometric quantification of bands that have been stained by an intercalating dye (e.g., ethidium bromide).

b. PE Prism 310 Genetic Analyzer CE Device. PCR products can be amplified with fluor-labeled primers. One microliter of each PCR reaction can be combined with 9 μL of formamide and 0.5-0.1 μL of ROX size marker. Samples can be heated to 94° C. for 5 min and flash cooled in an ice slurry. Samples can be loaded onto the machine and electrophoresed at 15 kV, 60° C. for 35-45 min using POP4 polymer and filter set D. The injection parameters can be 15 kV, 5 sec. Fragment analysis software, GeneScan (Applied Biosystems, Inc., Foster City, Calif.), can be used to quantify peak heights that are used to calculate NT/CT ratios. No size correction need be performed where each DNA molecule is tagged with one fluorescent marker from one labeled primer.

c. Agilent 2100 Bioanalyzer Microfluidic CE Device. The DNA 7500 or DNA 1000 LabChip kit may be used. Following amplification, 1 μL of each 10 μL PCR reaction can be loaded into a well of a chip prepared according to the protocol supplied by the manufacturer. The DNA assay can be run, which applies a current to each sample sequentially to separate NT from CT. DNA can be detected by fluorescence of an intercalating dye in the gel-dye matrix. NT/CT ratios can be calculated from area under curve (AUC) and one or more size corrections can be made.

d. Caliper AMS 90 Microfluidic CE Device. The PCR reactions can be set up in wells of a 96- or 384-well microplate. Following amplification, the microplate can be placed in a Caliper AMS 90 and the protocol recommended by the manufacturer followed. The AMS 90 can remove and electrophorese a sample from each well sequentially every 30 sec. The NT and CT PCR products can be separated and quantified. Where detection is through fluorescent intercalating dye, size correction may not be necessary.

e. MALDI-TOF separation. A method for separating PCR products recently was described. Ding, C. and Cantor, C. R. (2003) A high-throughput gene expression analysis technique using competitive PCR and matrix-assisted laser desorption ionization time-of-flight MS. Proc. Natl. Acad. Sci. USA 100, 3059-3064. This method may be used to quantify products resulting from amplification of cDNA in the presence of SMIS.

Calculation of Gene Expression: Calculating the Number of NT Molecules Present at the Beginning of PCR for Each Gene

The steps taken to calculate gene expression can be based on densitometric measurement values for electrophoretically separated NT and CT PCR products such as those presented in FIG. 33. The calculations below are based on the example in FIG. 33, measuring GST gene expression relative to β-actin in an actual bronchial epithelial cell (BEC) sample. A volume of SMIS containing 600,000 competitive template molecules for β-actin and 6000 competitive template molecules for GST was included at the beginning of the PCR reaction. For each gene, the NT and competitive template amplify with the same efficiency. Thus, the β-actin gene NT/CT PCR product ratio allows determination of the number of β-actin NT copies at the beginning of PCR and the target gene NT/CT ratio allows determination of the number of target gene copies at the beginning of PCR, as detailed in the steps below:

1. Correct NT PCR product area under the peak (AUP) to length of CT DNA.

2. Determine ratio of corrected NT AUP relative to CT AUP.

3. Multiply NT/CT value×number of CT molecules at beginning of PCR.

A calculation of β-actin molecules using the above protocol is outlined below:

1. 416/532 (β-actin CT bp/NT bp)×42 (NT AUP)=33 (corrected NT AUP).

2. Corrected β-actin NT AUP divided by β-actin CT AUP=0.37.

3. 0.37 (β-actin NT/CT)×600,000 (number of β-actin CT molecules at beginning of PCR)=222,000 NT molecules at beginning of PCR.

A calculation of GST molecules using above protocol is outlined below:

1. 227/359 (GST CT bp/NT bp)×1.5 (NT AUP)=0.95 (corrected NT AUP).

2. 0.95 (GST corrected NT AUP) divided by 4.4 (GST CT AUP)=0.22.

3. 0.22 (GST NT/CT)×6000 (number of GST CT molecules at beginning of PCR)=1290 GST NT molecules at beginning of PCR.

Calculation of molecules of GST/10⁶ β-actin molecules is 1290 GST NT molecules/222,000 β-actin NT molecules=approximately 5,800 GST molecules/10⁶ β-actin molecules.
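The three calculation steps can be sketched in a few lines of Python. This is a minimal illustration only; the function and parameter names are assumptions introduced here, and the β-actin copy number is taken from the worked result above because the β-actin CT AUP is not stated in the text.

```python
def nt_molecules(nt_aup, ct_aup, ct_bp, nt_bp, ct_molecules):
    """Estimate NT copies present at the beginning of PCR (hypothetical helper)."""
    corrected_nt = nt_aup * ct_bp / nt_bp   # step 1: correct NT AUP to CT length
    nt_ct_ratio = corrected_nt / ct_aup     # step 2: corrected NT AUP / CT AUP
    return nt_ct_ratio * ct_molecules       # step 3: multiply by CT copies added


def per_million_reference(target_nt, reference_nt):
    """Express target copies per 10^6 reference gene (e.g., beta-actin) copies."""
    return target_nt / reference_nt * 1e6


# GST values quoted in the worked example above
gst_nt = nt_molecules(nt_aup=1.5, ct_aup=4.4, ct_bp=227, nt_bp=359,
                      ct_molecules=6000)
print(round(gst_nt))  # 1293, close to the 1290 quoted (intermediate rounding differs)

# beta-actin NT copies taken directly from the worked result (222,000)
print(per_million_reference(gst_nt, 222_000))
```

Because intermediate values are not rounded in the sketch, its results may differ slightly from the hand calculation.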

Example III

The following example provides additional details of a non-two-step approach for evaluating gene expression according to some embodiments of the instant invention.

RNA Extraction

Purified deoxyribonucleotides obtained from Pharmacia (Piscataway, N.J.) were diluted to a stock solution of 10 mM. Recombinant Thermus aquaticus DNA polymerase (Taq polymerase), avian myeloblastosis virus (AMV) reverse transcriptase, and ribonuclease inhibitor (RNasin) were obtained from Promega (Madison, Wis.). EcoRI enzyme was obtained from USB (Cleveland, Ohio). Primers were prepared on an Applied Biosystems model 391 PCR-Mate EP™ synthesizer. PCR was performed in a Perkin-Elmer Cetus DNA Thermal Cycler 480. The other buffers and solutions used were from various sources and were molecular biology grade.

Studies were performed on a human papillomavirus-immortalized human bronchial epithelial cell line (BEP2D) (Willey et al., Cancer Res. 51:5370-5377, 1990). RNA was isolated based on the method described by Chomczynski and Sacchi (Analytical Biochemistry 162:156-159, 1987). Culture medium was removed from flasks containing the BEP2D cell line. Immediately, GIT buffer (4.0 M guanidinium thiocyanate, 0.1 M Tris-Cl pH 7.5, 1% beta-mercaptoethanol) was placed on the cells (approximately 500 μL per 5-10 million BEP2D cells). Each 500 μL of GIT buffer containing the lysed cells was then transferred to a 1.5 mL microfuge tube. To each microfuge tube was added sequentially 50 μL of 2 M Na acetate pH 4, 500 μL of water-saturated phenol and 100 μL of chloroform-isoamyl alcohol mixture (49:1). The tubes then were shaken thoroughly, placed on ice for 15 min, and microcentrifuged for 20 min at 14,000 RPM and 4° C. The aqueous phase of each tube was transferred to a fresh tube and the above extraction was repeated. Again, the aqueous phase of each tube was transferred to a fresh tube with isopropanol (500 μL), and placed at −70° C. for 15 min. The tubes were then microcentrifuged for 20 min at 14,000 RPM and 4° C. The RNA was washed twice with 70% ethanol and vacuum dried. RNA was taken up in 0.1% diethyl pyrocarbonate (DEPC) treated H₂O and quantified by spectrophotometry (Gilford Instrument Spectrophotometer 260).

Reverse Transcription

The reverse transcription was conducted as follows: the extracted RNA was placed in a sterile microfuge tube. For each 1 μg of RNA, 0.2 mg oligo dT was added. This was heated at 65° C. for 5 min and placed on ice for one min. To this was added 2 μL 1-mM dNTP's, 2 μL reverse transcriptase (RT) buffer (500 mM Tris, 400 mM KCl, and 80 mM MgCl₂), 0.5 μL RNasin, and 1 μL AMV reverse transcriptase (9,500 units/ml). This was incubated at 42° C. for one hour and heated to 80° C. for 10 min to halt the reaction. Resultant cDNA was stored at −20° C.

Preparation of Primers and CTs, PCR Amplification and Gel Electrophoresis

The preparation of primers and competitive templates was as follows: suitable sequences were identified using the Oligo-TM Primer Analysis Software (National Biosciences, Hamel, Minn.). The primers were made using an Applied Biosystems Model 391 PCR-Mate DNA Synthesizer. The primer sequences are described below.

Glutathione Peroxidase (GSH-Px) (Chada et al., Genomics 6:268-271, 1990)

The “outer” primers were used to amplify both the nucleic acid to be measured and its competitive template and result in a product length of 354 base pairs. The “outer” primers are

Sequence I.D. No. 1 (Chada et al., Genomics 6:268-271, 1990) Pos. 241 5′-GGGCCTGGTGGTGCTTCGGCT-3′ (coding sense), which corresponds to bases 241-261 of the cloned sequence, and Sequence I.D. No. 2 (Chada et al., Genomics 6:268-271, 1990) Pos. 574 5′-CAATGGTCTGGAAGCGGCGGC-3′ (anti-coding sense), which anneals to bases 574-594.

The “inner” primers used to synthesize the mutated competitive template remove an EcoRI restriction endonuclease recognition site (GAATTC) by changing a native cDNA base pair (bold bases). The “inner” primers are

Sequence I.D. No. 3 (Chada et al., Genomics 6:268-271, 1990) Pos. 309 5′-ATTCT GATTTC CCTCAAGTACGTCCGGCCT-3′ (coding sense)

Sequence I.D. No. 4 (Chada et al., Genomics 6:268-271, 1990) Pos. 309 3′-TAAGA CTAAAG GGAGTTCATGCAGGCCGGA-5′ (anti-coding sense).

Both primers correspond to bases 309-338 of the cloned sequence. The mutation results from the substitution of a T for the native A at position 316 of the sense strand. Restriction endonuclease digestion of the native GSH-Px yields products of 280 and 74 base pairs.

Glyceraldehyde-3-Phosphate Dehydrogenase (GAPDH) (Tso et al., Nucleic Acids Res. 13:2485-2502, 1985)

The “outer” primers used to amplify both the native and mutated templates result in a product length of 788 or 790 base pairs. The “outer” primers are:

Sequence I.D. No. 5 (Tso et al., Nucleic Acids Res. 13:2485-2502, 1985) Pos. 46 5′-GGTCGGAGTCAACGGATTTGGTCG-3′ (coding sense) corresponding to bases 9-32 of the cloned sequence, and Sequence I.D. No. 6 (Tso et al., Nucleic Acids Res. 13:2485-2502, 1985) Pos. 812 5′-CCTCCGACGCCTGCTTCACCAC-3′ (anti-coding sense), which anneals to bases 777-798.

The “inner” primers used to synthesize the mutated template create an EcoRI restriction endonuclease recognition site (GAATTC) by changing one native cDNA base pair (bold bases). The “inner” primers are:

Sequence I.D. No. 7 (Tso et al., Nucleic Acids Res. 13:2485-2502, 1985) Pos. 234 5′-TGATCAATG GAATTC CCATCACCA-3′ (coding sense)

Sequence I.D. No. 8 (Tso et al., Nucleic Acids Res. 13:2485-2502, 1985) Pos. 234 3′-ACTAGTTAC CTTAAG GGTAGTGGT-5′ (anti-coding sense)

Both primers correspond to bases 199-222 of the cloned sequence. The mutation results from the substitution of a T for the native A at position 211 of the sense strand. Restriction endonuclease digestion of the mutated GAPDH yields products of 588 and 200 base pairs.

Several experiments were performed using a different mutated GAPDH template. This template had a novel BamHI restriction site introduced.

The “outer” primers used to amplify both the native and mutated templates result in a product length of 634 base pairs. The “outer” primers are:

Sequence I.D. No. 9 (Tso et al., Nucleic Acids Res. 13:2485-2502, 1985) Pos. 200 5′-CATGGCACCGTCAAGGCTGAGAAC-3′ (coding sense) corresponding to bases 165-188 of the cloned sequence, and Sequence I.D. No. 10 (Tso et al., Nucleic Acids Res. 13:2485-2502, 1985) Pos. 813 5′-CCTCCGACGCCTGCTTCACCAC-3′ (anti-coding sense), which anneals to bases 777-798.

The “inner” primers used to synthesize the mutated template create a BamHI restriction endonuclease recognition site (GGATCC) by changing one native cDNA base pair (bold bases). The “inner” primers are:

Sequence I.D. No. 11 (Tso et al., Nucleic Acids Res. 13:2485-2502, 1985) Pos. 368 5′-CAGGGG GGATCC AAAAGGGTCATCAT-3′ (coding sense)

Sequence I.D. No. 12 (Tso et al., Nucleic Acids Res. 13:2485-2502, 1985) Pos. 368 3′-GTCCCC CCTAGG TTTTCCCAGTAGTA-5′ (anti-coding sense)

Both primers correspond to bases 333-358 of the cloned sequence. The mutation results from the substitution of a T for the native G at position 342 of the sense strand. Restriction endonuclease digestion of this mutated GAPDH yields products of 460 base pairs and 174 base pairs.

The mutated internal standard competitive templates were prepared by site directed mutagenesis as described by Higuchi et al., Nucleic Acids Res. 16:7351-7367, 1988. These single base mutations resulted in either the gain (GAPDH) or loss (GSH-Px) of an EcoRI restriction endonuclease recognition site. (Experiments were also conducted using a mutated GAPDH with a BamHI site introduced.) For each mutated product, two initial polymerase chain reactions using an “outer” primer and an “inner” single base mismatched primer produce two overlapping DNA fragments (Primers 1 and 4, 2 and 3 for GSH-Px; Primers 5 and 8, 6 and 7 for GAPDH). These overlapping DNA fragments were electrophoresed on a 3% Nusieve, 1% LE agarose ethidium bromide stained gel. Bands were excised and purified using Millipore Ultrafree-MC 0.45 μm filter (Nihon Millipore Kogyo K.K., Yonezawa, Japan). The purified DNA was ethanol precipitated, washed, vacuum-dried and taken up in 100 μL sterile dH₂O. 1 μL of each of the two overlapping DNA fragments was PCR amplified using the outer primers only. The first PCR cycle was performed without primers to allow for heterodimer formation. The entire mutated product was thus formed and amplified. The mutated PCR product was gel purified as described above and re-amplified to form bulk product. The bulk product was gel purified and measured spectrophotometrically. The mutated products were diluted to the attomolar range for use as competitive templates. Herring sperm DNA (Lofstrand, Bethesda, Md.) 1 μg/ml was used as a carrier. Restriction endonuclease digestion was performed on samples of each mutated template to assure lack of contamination.

The PCR conditions were standardized for each experiment by using a master mixture containing 1× PCR buffer (50 mM KCl, 10 mM Tris-HCl, pH 9.0, 1.5 mM MgCl₂), 25 pmoles of primers coding for GSH-Px and GAPDH, 0.2 mM dNTP's (A,T,C,G), and constant amounts of both internal standards per 100 μL reaction mixture. Taq DNA polymerase (2.5 units) was added to each 100 μL reaction prior to amplification. cDNA obtained from the BEP2D cell line was serially diluted and added to the sample PCR tubes. In all experiments, control tubes containing no template, native cDNA only, or mutated competitive templates only were amplified to check for contamination or complete enzyme digestion.

PCR amplification was carried out for 35 cycles at 94° C. for one min, 60° C. for one min, and 72° C. for one min. After amplification, PCR products were heated for 10 min in order to maximize heterodimer formation.

The quantification of products was as follows: Samples (40 μL) for each PCR tube were EcoRI restriction endonuclease digested for 12-16 hours. (Experiments conducted using mutated GAPDH with the novel BamHI restriction site were also BamHI restriction endonuclease digested for 4-5 hours.) These products were isolated by electrophoresing on a 3% Nusieve, 1% LE agarose ethidium bromide stained gel for 2-3 hours at 60 V. A negative photograph was taken of the gel using Polaroid 665 positive/negative instant film.

The negative photograph was subjected to densitometry (Zeineh Soft Laser Scanning Densitometer Model SLR 2D/1D using Zeineh 1D Autostepover Videophoresis Program Software, Biomed Instruments, Fullerton, Calif.). Alternatively, the stained gel is evaluated densitometrically directly using a digital camera, or evaluated on an automated sequencing gel (such as that offered by Applied Biosystems, Inc.). Areas under each curve were calculated and used for quantification. Corrections were made for relative band sizes and heterodimer formation. Data were expressed as GSH-Px to GAPDH relative ratios.

In a second set of experiments, multiplex competitive reverse transcriptase polymerase chain reaction (MC RT-PCR) with competitive templates prepared by the Celi method was used to evaluate the cytochrome p450 (CYP) IAI gene in β-naphthoflavone-exposed BEP2D cells. The induction of CYPIAI gene expression was evaluated using both MC RT-PCR with Celi competitive templates, and Northern analysis. Competitive templates were prepared for both the CYPIAI and GAPDH genes. The primers used to prepare the competitive template for GAPDH were:

Sequence I.D. No. 13 (Tokunaga et al., Cancer Res. 47:5616-5619, 1990) Pos. 75 5′-GGT CGG AGT CAA CGG ATT TGG TCG-3′ Pos. 94 and

Sequence I.D. No. 14 (Tokunaga et al., Cancer Res. 47:5616-5619, 1990) Pos. 842 5′-CCT CCG ACG CCT GCT TCA CCC CAT CAC GCC ACA GTT TCC C-3′ Pos. 616 (with the junction, \/, between Pos. 822 and Pos. 636)

The lower outer primer used in conjunction with Sequence I.D. No. 13 to amplify both the competitive and native templates was

Sequence I.D. No. 15 (Tokunaga et al., Cancer Res. 47:5616-5619, 1990) Pos. 842 5′-CCT CCG ACG CCT GCT TCA CC-3′ Pos. 822.

The primers used to prepare the competitive template for CYPIAI were:

Sequence I.D. No. 16 (Jaiswal et al., Science 228:80-83, 1989) Pos. 1241 5′-CAT CCC CCA CAG CAC AAC AAG-3′ Pos. 1262 and:

Sequence I.D. No. 17 (Jaiswal et al., Science 228:80-83, 1989) Pos. 1575 5′-ACA GCA GGC ATG CTT CAT GGG TCT CAC CGA TAC ACT TCC G-3′ Pos. 1448 (with the junction, \/, between Pos. 1555 and Pos. 1428)

The lower outer primer used in conjunction with Sequence I.D. No. 16 to amplify both the competitive and native templates was

Sequence I.D. No. 18 (Jaiswal et al., Science 228:80-83, 1989) Pos. 15755′-ACA GCA GGC ATG CTT CAT GG-3′ Pos. 1555

The PCR amplification conditions were the same as described for experiments using the competitive templates prepared for GAPDH and GSH-Px by the Higuchi method except the annealing temperature was 55 degrees centigrade and the amplification was carried out for 38 cycles.

Because the native and competitive templates separate without prior restriction endonuclease digestion, samples were taken directly from the PCR reaction tube and applied to ethidium bromide stained 3% Nusieve, 1% LE agarose gels. It was then possible to quantify the products by taking a negative photograph of the gel using Polaroid 665 positive/negative instant film and subjecting the negative photograph to densitometry.

RNA from BEP2D cells incubated for varying times with β-naphthoflavone (10 μM) was either electrophoresed on a 1% LE formaldehyde denaturing gel for Northern analysis or MC RT-PCR amplified, as described above. For Northern analysis, following transfer of the RNA to GeneScreen, the filters were hybridized with ³²P-labeled CYPIAI cDNA.

The procedure used for PCR quantitation is as follows: Serial dilutions of BEP2D cDNA (representing 0.25 μg to 0.05 μg total RNA) were co-amplified with constant amounts of each single base mutated internal standard competitive template (10 attomoles each), then analyzed as described above.

FIG. 34 illustrates negative photographs of the gels analyzed by densitometry in order to quantify each band. Starting with the area under each curve obtained by the densitometric evaluation of the bands, the ratios of native/competitive template amplified product were calculated as follows. Corrections were made for relative band sizes. (Competitive template for GAPDH was multiplied by 788/588 when compared to native nucleic acid for GAPDH and native GSH-Px was multiplied by 354/280 when compared to competitive template for GSH-Px.)
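The band size correction described above amounts to scaling the area of the measured (cut) fragment by the ratio of the full-length product to that fragment. A minimal sketch follows; the function name and the area value of 100.0 are hypothetical and are used only to show the arithmetic.

```python
def size_corrected_area(area, full_length_bp, fragment_bp):
    """Scale the area of a cut fragment up to the full-length product."""
    return area * full_length_bp / fragment_bp


# Length pairs taken from the text: (full-length bp, measured fragment bp)
GAPDH_CT_LENGTHS = (788, 588)      # mutated GAPDH, cut by EcoRI
GSH_PX_NT_LENGTHS = (354, 280)     # native GSH-Px, cut by EcoRI

example_area = 100.0  # hypothetical densitometric area, not from the source
print(round(size_corrected_area(example_area, *GAPDH_CT_LENGTHS), 1))   # 134.0
print(round(size_corrected_area(example_area, *GSH_PX_NT_LENGTHS), 1))  # 126.4
```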

During PCR, under conditions in which primer is limiting, heterologous single strands of DNA with sequence homology may anneal to form heterodimers (Gilliland, G., Perrin, S., Blanchard, K. and Bunn, H. F. (1990) Proc. Natl. Acad. Sci. 87:2725-2729). When the heterologous strands differ by only one base pair, as in this particular example, the heterologous strands can re-anneal randomly (Gilliland et al., supra; Thompson, J. D., Brodsky, I., and Yunis, J. J. (1992) Blood 79:1629-1635), as shown in the Punnett square below:

          N       M
  N       NN      NM
  M       NM      MM

where N=the proportion of single-stranded native products prior to re-annealing, M=the proportion of single-stranded mutated products prior to re-annealing, NN (or N²)=the proportion of double-stranded native products after re-annealing, 2NM=the proportion of heterodimer formed after re-annealing, and MM (or M²)=the proportion of double-stranded mutated products after re-annealing.

Heterodimers were accounted for indirectly because they were not cut, in this example, by the restriction enzyme and had the same electrophoretic mobility as the undigested homodimer. Therefore, heterodimers were read densitometrically along with the undigested homodimer. In order to quantitate products, based on the Punnett square distribution, random heterodimer formation was promoted following PCR. This was done (according to the methods described in Gilliland et al., supra, and Thompson et al., supra), by heating the products to 100° C. for 10 min. followed by slow cooling. Following promoted formation of heterodimers, the quantity of each product was determined by analysis of the densitometric data using the quadratic formula, as the formation of heteroduplexes follows a binomial distribution under these conditions (Gilliland et al., Proc. Natl. Acad. Sci. 87:2725-2729 (1990), Becker-Andre et al., Nucleic Acids Res. 17:9437-9446 (1989)).

For GAPDH, in this example, neither the native product (NN) nor the heterodimer (NM) was cleaved by EcoRI. Therefore, the larger band represented both native GAPDH homodimer (NN) and the NM heterodimer. This band was represented arithmetically by N²+2NM, according to the Punnett square, while the proportion in the band resulting from EcoRI cleavage was represented by the value M². Therefore, when the amounts of native (N) and mutated (M) template are equal (1:1) prior to PCR, after heterodimer formation is randomized, the apparent ratio will be 3:1 [(N²+2NM):M²]. To illustrate this further, the raw densitometric data from the first sample lane (shown in FIG. 34) are shown in FIG. 35 and are mathematically processed to final ratios below:

The value of M² is known (2,214), as is the value of N²+2NM (10,095). From this information, M is calculated (47.05), and solving for N results in a quadratic equation (aX²+bX+c=0):

N²+2N(47.05)−10,095=0

The quadratic formula (N=[−b±√(b²−4ac)]/2a) is used to solve for N. In this case, a=1, b=94.1, c=−10,095, and thus N=63.89. The information sought is the ratio N/M, which is 63.89/47.05 or 1.36/1. (Although proportions of single-stranded DNA present after PCR are solved for, they will be identical to those of the corresponding double-stranded DNA present prior to the PCR, in this example.)
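The quadratic solution can be checked with a short Python sketch. The function name is an assumption introduced for illustration; the inputs are the two GAPDH band values quoted above, with the uncut band taken as N²+2NM and the cut band as M².

```python
import math


def single_strand_values(uncut_band, cut_band):
    """Solve N^2 + 2NM = uncut_band and M^2 = cut_band for N and M."""
    m = math.sqrt(cut_band)
    # N^2 + (2M)N - uncut_band = 0, i.e. a = 1, b = 2M, c = -uncut_band
    a, b, c = 1.0, 2.0 * m, -uncut_band
    n = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)  # positive root only
    return n, m


n, m = single_strand_values(uncut_band=10_095, cut_band=2_214)
print(round(m, 2), round(n, 2), round(n / m, 2))  # 47.05 63.89 1.36
```

Only the positive root is kept, since a negative proportion of single-stranded product has no physical meaning.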

Since densitometric values are relative, it is possible to avoid the inconvenience of using the quadratic formula by assigning the bands proportionate densitometric values that when added=1, or (N²+2NM)+M²=1. Solving for this equation:

(N²+2NM)+M²=(N+M)²=1, and therefore N+M=1

The relative fractions of 1 assigned to each of the bands are determined by their respective densitometric values (FIG. 35). Since the total densitometric value of both bands is 12,309 (10,095+2,214), the relative proportion of the larger band (N²+2NM) is 0.82 (10,095/12,309) and the relative proportion of the smaller band (M²) is 0.18 (2,214/12,309). Thus, the proportion of mutated GAPDH homodimer (M²) is 0.18, and the proportion of single-stranded mutated GAPDH (M) is 0.424. Since N+M=1, the proportion of single-stranded native GAPDH (N) is 1−0.424 or 0.576, and the ratio of native to mutated product is 0.576/0.424 or, as calculated above, 1.36/1.

Next, in this example, the same calculations are carried out using the densitometric values for native and mutated GSH-Px from the same lane as the GAPDH values above (Table 1):

N²=0.558, N=0.747, and M=1−0.747=0.253

Native/mutated ratios are obtained:

GSH-Px native/mutated=0.747/0.253=2.95/1

GAPDH native/mutated=0.576/0.424=1.36/1

Final values were expressed as an odds ratio (e.g., a “ratio of ratios”):

GSH-Px native/mutated:GAPDH native/mutated=2.95/1.36=2.17/1
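The normalized shortcut and the final ratio of ratios can be sketched as follows. The helper name is hypothetical. For GAPDH the cut band is the mutated homodimer (M²); for GSH-Px the cut band is the native homodimer (N²). The GSH-Px uncut fraction of 0.442 is not stated in the text and is assumed here as 1−0.558.

```python
import math


def cut_species_fraction(cut_band, uncut_band):
    """Single-strand fraction of the species whose homodimer is cut.

    The cut band is x^2 when the two bands are normalized to sum to 1,
    so x is the square root of the cut band's share of the total.
    """
    return math.sqrt(cut_band / (cut_band + uncut_band))


# GAPDH: the mutated template carries the EcoRI site, so x = M
gapdh_m = cut_species_fraction(cut_band=2_214, uncut_band=10_095)
gapdh_ratio = (1.0 - gapdh_m) / gapdh_m        # native/mutated

# GSH-Px: the native template carries the EcoRI site, so x = N
gshpx_n = cut_species_fraction(cut_band=0.558, uncut_band=0.442)  # 0.442 assumed
gshpx_ratio = gshpx_n / (1.0 - gshpx_n)        # native/mutated

print(round(gapdh_ratio, 2), round(gshpx_ratio, 2),
      round(gshpx_ratio / gapdh_ratio, 2))     # 1.36 2.95 2.17
```

The shortcut gives the same GAPDH ratio as the quadratic solution because (N+M)² equals the sum of the two bands.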

As FIG. 19 illustrates, the relationship between the amount of native product (in arbitrary densitometric units) and total starting RNA did not remain linear throughout PCR amplification for either GSH-Px or GAPDH.

As FIG. 20 illustrates, however, the relationship of the ratios GSH-Px native/competitive template and GAPDH native/competitive template to total starting RNA was linear for both genes. By averaging the ratio of GSH-Px native/competitive template to GAPDH native/competitive template obtained from sample tubes (2.17:1, 2.14:1, 2.00:1, 1.76:1, 2.46:1, 2.71:1, and 1.92:1), a mean value of 2.17:1 with a S.D. of 0.33 was obtained. In this example, no value varied more than 25% from the mean.

To assess the variability of this technique, the experiment was repeated using different dilutions of mutated (competitive template) standards and master mixture. By averaging the ratio of GSH-Px native/competitive template to GAPDH native/competitive template obtained from each sample tube in this example (1:9.09, 1:8.13, 1:9.43, 1:8.13, 1:6.62, 1:8.77, 1:7.69, 1:10.00, 1:7.58, and 1:7.04), a mean value of 1:8.25 with a S.D. of 1.07 was obtained. In this example, no value varied more than 22% from the mean.

To assess the variability between samples using the same master mixture and dilutions of mutated standards (using mutated GAPDH with novel BamHI restriction site), BEP2D RNA was independently extracted from three separate flasks and reverse transcribed to cDNA. Five-fold dilutions of cDNA were performed. Four PCR tubes were run for each study. The obtained ratios of GSH-Px native/competitive template to GAPDH native/competitive template were 15.01:1, 17.69:1, and 21.76:1 (mean=18.15, S.D.=3.40). In this example, all of the 3 values were within 20% of the mean.
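The summary statistics quoted above can be reproduced with the Python standard library. This sketch assumes the reported S.D. is the sample standard deviation (n−1 denominator), which is what matches the quoted values; the function name is introduced here for illustration.

```python
from statistics import mean, stdev


def summarize(ratios):
    """Return mean, sample S.D., and largest fractional deviation from the mean."""
    avg = mean(ratios)
    sd = stdev(ratios)  # sample standard deviation (n - 1 denominator)
    max_dev = max(abs(r - avg) for r in ratios) / avg
    return avg, sd, max_dev


first_run = [2.17, 2.14, 2.00, 1.76, 2.46, 2.71, 1.92]
three_flasks = [15.01, 17.69, 21.76]

for label, values in (("first run", first_run), ("three flasks", three_flasks)):
    avg, sd, max_dev = summarize(values)
    print(f"{label}: mean={avg:.2f} S.D.={sd:.2f}")
# first run: mean=2.17 S.D.=0.33
# three flasks: mean=18.15 S.D.=3.40
```

The third returned value, max_dev, can be compared against the stated percentage limits for each data set.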

As FIG. 36 illustrates, a similar increase in expression of the CYPIAI gene was observed in both Northern analysis and some embodiments of methods disclosed herein. FIG. 36A illustrates Northern analysis of RNA obtained from BEP2D cells that were treated with 0.1% DMSO as a control, or β-naphthoflavone in an effort to induce cytochrome p450 IA1 (CYPIAI). FIG. 36B illustrates DNA PCR-amplified from serial dilutions of cDNA from the same cells used in FIG. 36A. The cDNA was co-amplified in the presence of competitive templates for GAPDH and CYPIAI, according to some embodiments of the instant invention.

Comparing the bands of the GAPDH reference gene representing native and competitive template cDNA indicates that approximately the same amount of cDNA was loaded in the lane with 1 μl of specimen from control cells and the lane with 3 μl of specimen from β-naphthoflavone exposed cells. Hence, the band representing the native CYPIAI gene is much more strongly represented in the lane containing cDNA from β-naphthoflavone exposed cells compared to control cells.

To assess the variability of this technique, a repeat of the above experiment was performed using different dilutions of competitive templates and master mixture. By averaging the ratio obtained from each sample tube (1:9.09, 1:8.13, 1:9.43, 1:8.13, 1:6.62, 1:8.77, 1:7.69, 1:10.00, 1:7.58, and 1:7.04), a mean value of the ratio of GSH-Px native/competitive template to GAPDH native/competitive template of 1:8.25 with a S.D. of 1.07 was obtained. In this example, no value varied more than 22% from the mean, indicating the precision of this technique and the variability introduced by new master mixtures containing new dilutions of competitive standards.

To assess the variability between samples using the same master mixture and dilutions of competitive templates, BEP2D RNA was independently extracted from three separate flasks and reverse transcribed to cDNA. Only coarse (5-fold) dilutions of cDNA were performed. Four PCR tubes were run for each study. The obtained ratios of GSH-Px native/competitive template to GAPDH native/competitive template were 15.01:1, 17.69:1, and 21.76:1 (mean=18.15, S.D.=3.40). In this example, the 3 values were all within 20% of the mean, indicating the precision of this technique when comparing samples that have been independently reverse transcribed but amplified with the same master mixture and internal standard dilutions. Northern analysis of BEP2D RNA reveals a ratio of GSH-Px/GAPDH mRNA of approximately 1:8.

Example IV

Blinded Inter-Laboratory Study to Evaluate Reproducibility

In a first study, six laboratories participated in triplicate measurement of five genes in cDNA derived from a bronchogenic carcinoma tissue sample 16009T. A variety of electrophoresis methods and imaging software programs were used in different laboratories to analyze amplified product. Study 1 Laboratory 2 used an Agilent 2100 Bioanalyzer. The intra-laboratory average CV for all gene expression measurements was 0.36, which is comparable to that previously reported (Willey et al, 1998; Rots et al, 1999; Rots et al, 2000; Mollerup et al, 1999; Loitsch et al, 1999). The inter-laboratory variation showed an average CV of 0.71.

In a second study, slab gel electrophoresis and NIH Image software were used to measure expression of 10 genes (the 5 previously measured plus 5 additional genes) in A549 cDNA. Four of the original laboratories were able to participate in the second study. The combined average CV for all nine genes that could be measured was 0.27 and 0.48 for intra-lab and inter-lab comparison, respectively. For TNF alpha, each laboratory determined that the expression was too low to be quantified. Of the four laboratories, three laboratories were able to quantify HNF3α while the fourth lab was not. The lower limit of detection of a PCR product above background was established for the second study as an NIH Image arbitrary densitometric value of 5 above background. Although the fourth laboratory observed NT and CT PCR products for HNF3α, they were below the cut-off level of 5 and therefore not included in the analysis. A CT mix that contributed 60 molecules of nucleic acid CT (F mix) was used to detect HNF3α.

Example V

Comparison to Oligonucleotide Microarray

In a first study, Affymetrix oligonucleotide arrays and an embodiment of the two-step approach of the instant invention each were used to measure expression of a total of 22 xenobiotic metabolism enzyme or antioxidant genes in Human Oral Epithelial (HOE) cells. Expression in normal HOE cells was compared to expression in immortalized buccal epithelial cell line SVpgC2a. The difference in expression between HOE and SVpgC2a was compared for each gene. Then, the differences detected by microarray were compared to the differences detected by StaRT-PCR.

The cRNA chip method gives results based on relative signals that rely on perfect matches and mismatches from chip hybridizations, in combination with image software analysis, to derive results from hybridization intensities. The technique allows for the analysis of up to 12,000 genes simultaneously and is semi-quantitative in terms of signal intensities from one hybridization experiment compared with other hybridizations. In this study three sets of hybridizations utilizing samples from normal HOE cells or SVpgC2a cells were performed, then data from each measured in SVpgC2a cells were compared to corresponding data from HOE cells. The first of the three hybridizations was performed with the HuFL 6800 chip.

Gene expression measurements were in close agreement and demonstrated that the expression levels of several phase I and II metabolism transcripts were similar in normal and immortalized keratinocytes. Of the CYP genes analyzed, most were expressed at low levels, i.e. below 20 mRNA molecules/10⁶ β-actin mRNA molecules. In this study, cDNA concentration allowed for the quantification of mRNA levels of 4 molecules/10⁶ β-actin molecules.

Methods comprising embodiments of the instant invention were more sensitive. For example, transcripts that were found expressed at low levels were not detected with the chip method, i.e. transcripts expressed at levels below a few hundred molecules/10⁶ copies of β-actin were not detectable with the chip hybridization method. Using methods of the instant invention, a gene expression value was obtained in both the normal and immortalized cells for 14 genes of the 22 genes evaluated (for the remaining seven genes, expression was too low to be quantified in either the normal or immortalized cells). Of these 14 genes, a gene expression value also was obtained by microarray analysis in both normal and immortalized cells for only five of them. The differences in expression of these five genes in normal compared to immortalized cells were compared. The results are presented in FIG. 37.

In a second study, gene expression was measured for the Stratagene Human RNA Reference. The Stratagene Human RNA Reference comprises RNA from 10 cell lines mixed together, to represent transcription levels of a large fraction of genes in the human genome. Genes were evaluated using oligonucleotide microarray (Affymetrix U95 version 2 and HuGenFL genechips) and using embodiments of the instant invention.

Using embodiments of the instant invention, data for 163 of the 192 genes represented in G.E.N.E. Systems 1 and 2 CT mixes were obtained from the Stratagene Human RNA Reference. The remaining genes represented in Systems 1 and 2 were expressed at a value too low to be quantified (less than 6 molecules/10⁶ β-actin molecules). Of the 163 genes measured, 85 were represented on the HuGenFL gene chip. Of these 85 genes, it was possible to assign an expression value to all of the genes measured in accordance with embodiments of the instant invention, but only 41 genes based on microarray analysis.

Bivariate Correlation Analysis: FIG. 38 shows the Pearson correlation for some two-step embodiments vs. microarray values. The Pearson correlation had an r² value of 0.373, which was highly significant (P<0.001).

Sensitivity: FIG. 38 also compares sensitivity between some two-step embodiments and microarrays. Among the 41 genes for which expression values were obtained by both, values ranged over about two logs in microarray analysis and about three logs by some embodiments of two-step analysis (FIG. 38). Some embodiments of two-step analysis were about 10-fold more sensitive than microarray analysis.

Example VI

The following example details gene expression measurement in the business method provided as a service.

Automated preparation of reactions: A PerkinElmer (Boston, Mass., USA) robotic liquid handler is used to prepare 10 μL PCR reactions in 96- or 384-well microplates. First, the liquid handler is programmed to distribute 1 μL of primers for genes to be measured into wells of the microplates. Second, for each cDNA, a sufficient volume of PCR mixture for the anticipated number of gene expression measurements is prepared, containing buffer, Taq polymerase, dNTPs, cDNA and internal standards. The robot then distributes 9 μL of this PCR reaction mixture into each well. Thus, in each well, the internal standard competitive templates for each gene and cDNA are present in the same ratio. However, because only one pair of primers is present in each well, only one gene and its respective internal standard competitive template are amplified in each well. Following 35 cycles of PCR, each microplate is transferred to an AMS 90 SE30 high-throughput microfluidic device (Caliper/Zymark, Hopkinton, Mass., USA) for analysis.

Design of High Throughput Gene Expression Measurement

Step 1 amplification of 96 genes: Competitive templates can be combined into groups of 96 and named as sequential “Systems”. Thus, the first mix of CTs representing 96 genes was called System 1 and so on. A mix of primer pairs specific for each of the 96 genes in a System can be combined and diluted to a concentration of 0.05 μg. Thus, for each of the Systems 1-4 CT mixes representing 96 genes, there can be a mix of primers corresponding to the same 96 genes.

The cDNA sample can be diluted so that it is in balance with, i.e., calibrated to, approximately 600,000 molecules of β-actin from Mix D from each System. The appropriate amount of cDNA then will be PCR-amplified in the presence of the primer mix from one of the Systems, and Mix B, C, D, E, or F from the corresponding System. In this way, PCR products can be generated in each of 20 separate 10 μl PCR reactions, namely System 1, Mixes B, C, D, E, and F; System 2, Mixes B, C, D, E, and F; System 3, Mixes B, C, D, E, and F; and System 4, Mixes B, C, D, E, and F.
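The Step 1 layout is simply every combination of four Systems and five CT Mixes. A minimal sketch of that enumeration follows; the constant and function names are assumptions introduced for illustration.

```python
from itertools import product

SYSTEMS = (1, 2, 3, 4)               # each System covers 96 genes
MIXES = ("B", "C", "D", "E", "F")    # CT mixes used with each System
GENES_PER_SYSTEM = 96


def step_one_reactions(systems=SYSTEMS, mixes=MIXES):
    """List one Step 1 PCR reaction per (System, Mix) combination."""
    return [f"System {s}, Mix {m}" for s, m in product(systems, mixes)]


reactions = step_one_reactions()
print(len(reactions))                    # 20
print(len(SYSTEMS) * GENES_PER_SYSTEM)   # 384
```

The second figure is the total number of genes that the 20 Step 1 products can subsequently be used to measure.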

Thus far, the amount of cDNA and CT mix for 20 reactions has been consumed. Next, the PCR products included in these 20 PCR reactions can be used to measure all 384 genes included in Systems 1-4.
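
As an illustrative sketch only, the 20 Step 1 reactions and the 384 genes they cover can be enumerated as shown below. The System and Mix labels, the 96 genes per System, and the 10 μl volume come from the description; the variable names and the dictionary layout are hypothetical.

```python
from itertools import product

SYSTEMS = [1, 2, 3, 4]              # Systems 1-4 (from the description)
MIXES = ["B", "C", "D", "E", "F"]   # CT mixes used in Step 1
GENES_PER_SYSTEM = 96
REACTION_VOLUME_UL = 10

# One Step 1 reaction per (System, Mix) pair.
step1_reactions = [
    {"system": system, "mix": mix, "volume_ul": REACTION_VOLUME_UL}
    for system, mix in product(SYSTEMS, MIXES)
]

n_reactions = len(step1_reactions)
n_genes = len(SYSTEMS) * GENES_PER_SYSTEM
print(n_reactions, n_genes)  # 20 384
```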

Step 2: Initial Single gene PCR Amplification from Mix D PCR Products: The PCR products generated above in round one may be diluted up to 10,000-fold and still yield quantifiable PCR products in a second round of PCR amplification. Because an internal standard CT was included in each PCR reaction, and because the amplification efficiency of the internal standard CT is the same as the NT, a gene expression value obtained after a second 35-cycle round of PCR amplification can be the same as that obtained after the first 35 cycles. For example, 2 μl from the first round of amplification can be diluted 100-fold for use in the second round. Because there were 10 μl in the first round PCR reaction volume, this constitutes a 1000-fold dilution. One half of this diluted round one PCR reaction from Mix D for each System (100 μl) then can be mixed with an appropriate amount of dNTPs, Taq, and dH₂O and aliquoted into each of 96 wells, with each well containing a different pair of primers (representing a gene from the corresponding System) dried on the bottom. The remaining 100 μl can be saved for Step 3. Thus, a Step 2 PCR reaction mixture containing diluted Step 1 Mix D PCR product can be prepared for each of the four Systems, and distributed in 10 μl aliquots into each of 96 wells on the 384-well microplate.
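
One possible reading of the volumes in Step 2 is sketched below, for illustration only. It assumes that the diluted Step 1 product makes up 10% of the Step 2 reaction mixture, a figure taken from the cost analysis later in this section; under that assumption the overall dilution works out to 1000-fold. The NT and CT molecule counts and the per-cycle efficiency are hypothetical values used only to show that the NT/CT ratio is unchanged by dilution and by equally efficient amplification.

```python
from fractions import Fraction

ALIQUOT_UL = 2             # taken from the Step 1 reaction (from the text)
FIRST_DILUTION_FOLD = 100  # 100-fold dilution (from the text)
STEP2_PERCENT = 10         # assumed share of diluted product in the Step 2 mixture
WELL_VOLUME_UL = 10
WELLS_PER_SYSTEM = 96

diluted_volume_ul = ALIQUOT_UL * FIRST_DILUTION_FOLD           # 200 ul
half_for_step2_ul = diluted_volume_ul // 2                     # 100 ul
saved_for_step3_ul = diluted_volume_ul - half_for_step2_ul     # 100 ul
step2_mixture_ul = half_for_step2_ul * 100 // STEP2_PERCENT    # 1000 ul
overall_dilution_fold = FIRST_DILUTION_FOLD * 100 // STEP2_PERCENT
wells_supported = step2_mixture_ul // WELL_VOLUME_UL           # 100 wells

# Hypothetical starting amounts of native template (NT) and competitive
# template (CT) in the Step 1 product.
nt, ct = Fraction(6000), Fraction(600)
efficiency = Fraction(9, 10)   # hypothetical, identical for NT and CT
cycles = 35

ratio_before = nt / ct
growth = (1 + efficiency) ** cycles
ratio_after = (nt / overall_dilution_fold * growth) / (ct / overall_dilution_fold * growth)

print(overall_dilution_fold, wells_supported, ratio_before == ratio_after)
```

Exact rational arithmetic is used so that the equality of the two ratios is not obscured by floating-point rounding.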

Following PCR amplification, 1 μl of PCR product from each PCR reaction can be transferred into a well of a DNA 1000 chip mounted in an Agilent 2100. Each sample can be electrophoresed. The remaining PCR product can be retrieved into microfuge tubes to provide back-up material in the event of trouble with the electrophoresis.

Step 3: Selection of the most appropriate CT Mix Step 1 product for quantifying each of the 96 genes, based on the results of Step 2: Only about 50% of the NT and CT for various genes are expected to be in balance. Genes not in balance can be re-evaluated using a Mix containing different concentrations of CTs for those genes relative to β-actin CT. In round 3, expression of each gene can be evaluated using a Step 1 PCR reaction that contained an appropriate concentration of CTs for those genes relative to β-actin CT. A 3,900 μl PCR reaction mixture can be prepared, containing 3.9 μl of the Step 1 PCR product, and appropriate volumes of Taq polymerase, dNTPs and distilled H₂O.
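
The selection described in Step 3 might be sketched as follows, for illustration only. This passage does not state numerically when NT and CT are "in balance", nor the CT amounts in Mixes B through F; the 10-fold balance window and the 10-fold CT series used here are assumptions, and all function and variable names are hypothetical.

```python
import math

# Hypothetical CT molecules per reaction for one target gene in each Mix.
# A 10-fold series is assumed; the description gives no such values here.
CT_MOLECULES = {"B": 60_000, "C": 6_000, "D": 600, "E": 60, "F": 6}


def in_balance(nt_ct_ratio, max_fold=10.0):
    """True if NT and CT are within max_fold of each other (assumed criterion)."""
    return 1.0 / max_fold <= nt_ct_ratio <= max_fold


def select_mix(nt_ct_ratio_in_mix_d):
    """Pick the Mix whose CT amount is closest, on a log scale, to the NT
    amount estimated from the Step 2 (Mix D) measurement."""
    if nt_ct_ratio_in_mix_d <= 0:
        raise ValueError("ratio must be positive")
    nt_estimate = nt_ct_ratio_in_mix_d * CT_MOLECULES["D"]
    return min(
        CT_MOLECULES,
        key=lambda mix: abs(math.log10(nt_estimate / CT_MOLECULES[mix])),
    )


# A gene measured at NT/CT = 100 in Mix D is out of balance and would be
# re-evaluated with a Mix carrying more CT for that gene.
print(in_balance(100.0), select_mix(100.0))
```

Comparing on a log scale treats a 10-fold excess and a 10-fold deficit of CT symmetrically, which suits a CT series spaced by constant factors.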

384-well microplates with primers for different single genes dried to the bottom of each well can be prepared ahead of time. Such plates may be stored at 4 degrees C. for months without loss or decrease in primer function. Nonetheless, a decrease in primer efficiency over time would not be expected to change gene expression measurement numerical values, because each PCR reaction contains an internal standard CT. However, it could decrease the amount of amplified product for both NT and CT, reducing the signal to be quantified on an Agilent 2100.

Cost Analysis of Providing Business Method Embodiments as a Service

Using two 384-well microplate thermocyclers, two cDNA samples can be screened in eight different 96-gene Step 1 reactions. Four of these reactions can contain cDNA from Sample 1 and Mix D from Systems 1-4 respectively, and four reactions can contain cDNA from Sample 2 and Mix D from Systems 1-4 respectively. PCR products from each of these reactions can be diluted 100-fold. Eight PCR reaction mixtures, each sufficient for 96 PCR reactions, can be prepared with 10% volume represented by one of the eight Step 1 PCR products. The PCR reaction mixtures including Sample 1 then can be dispensed into the appropriate 96 wells of one of the 384-well microplates, and the PCR reaction mixtures including Sample 2 can be dispensed into the appropriate 96 wells of the other 384-well plate. Alternatively, when Systems 5-8 are used, both 384-well microplates can be used to screen 768 genes in Sample 1 with Mix D.

Preparing the eight different 96-gene PCR reactions can require a total of 8 μl of cDNA, and 8 μl of CT Mix D. The primary cost can be Taq polymerase and primers for 768 PCR reactions, approximately $25.00 and $30.00 respectively. The primary cost of analysis may be the Agilent chips, costing $12/chip. By applying 4 PCR products/channel, 48 genes/chip can be analyzed. Thus, the cost of materials per assay can be approximately $0.32. Labor costs can include one day of work at approximately $14.00/hour=$112.00, adding about $0.15/assay to make about $0.47/assay. There can be about two more days of data input and analysis, bringing the total cost to $0.77/assay. Based on these calculations, the fee can be approximately $1.00/gene expression measurement.
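
The arithmetic behind these per-assay figures can be reproduced with the illustrative sketch below. All prices and counts are taken from the paragraph above. The 8-hour working day is inferred from $14.00/hour = $112.00, and billing the two data-analysis days at the same hourly rate is an assumption. Under that assumption the unrounded total comes to about $0.76/assay, close to the $0.77 quoted above.

```python
ASSAYS = 768               # 2 samples x 4 Systems x 96 genes
TAQ_COST = 25.00           # Taq polymerase for 768 reactions
PRIMER_COST = 30.00        # primers for 768 reactions
CHIP_COST = 12.00          # per Agilent chip
GENES_PER_CHIP = 48        # 4 PCR products per channel
HOURLY_RATE = 14.00
HOURS_PER_DAY = 8          # inferred from $14.00/hour = $112.00 per day
ANALYSIS_DAYS = 2          # assumed to be billed at the same hourly rate

chips_needed = ASSAYS // GENES_PER_CHIP                        # 16 chips
materials_total = TAQ_COST + PRIMER_COST + chips_needed * CHIP_COST
materials_per_assay = materials_total / ASSAYS

labor_per_day = HOURLY_RATE * HOURS_PER_DAY                    # $112.00
bench_labor_per_assay = labor_per_day / ASSAYS
analysis_labor_per_assay = ANALYSIS_DAYS * labor_per_day / ASSAYS

total_per_assay = (
    materials_per_assay + bench_labor_per_assay + analysis_labor_per_assay
)

print(f"materials ${materials_per_assay:.2f}/assay")       # materials $0.32/assay
print(f"bench labor ${bench_labor_per_assay:.2f}/assay")   # bench labor $0.15/assay
print(f"total ${total_per_assay:.2f}/assay")               # total $0.76/assay
```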

The above detailed description of the present invention is given for explanatory purposes. It will be apparent to those skilled in the art that numerous changes and modifications can be made without departing from the scope of the invention. Accordingly, the whole of the foregoing description is to be construed in an illustrative and not a limitative sense, the scope of the invention being defined solely by the appended claims.

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and compositions within the scope of these claims and their equivalents be covered thereby.

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent or patent application was specifically and individually indicated as being incorporated by reference.

1. A business method comprising: collecting a first specimen comprising a first nucleic acid; measuring an amount of said first nucleic acid in a first sample of said first specimen wherein said measuring can enumerate less than about 1,000 molecules of said first nucleic acid in said first sample; and providing said amount as a numerical value wherein said numerical value allows direct comparison to an amount of said first nucleic acid in a second sample.
 2. The method as recited in claim 1 wherein said first and said second samples are measured at different times.
 3. The method as recited in claim 1 wherein said first and said second samples are collected from different subjects.
 4. The method as recited in claim 1 wherein said measurement provides a coefficient of variation of less than about 50% for said first nucleic acid.
 5. The method as recited in claim 1 wherein said measuring step is performed at least about 100 times per day.
 6. The method as recited in claim 1 wherein said first nucleic acid comprises an RNA molecule.
 7. The method as recited in claim 1 wherein said first nucleic acid comprises a DNA molecule.
 8. The method as recited in claim 1 wherein said method comprises an automated step.
 9. The method as recited in claim 1 wherein said method comprises a use of microfluidic capillary electrophoresis, an oligonucleotide array, mass spectrometry, or chromatography.
 10. The method as recited in claim 1 wherein said first specimen comprises at least about 1,000 cells.
 11. The method as recited in claim 1 wherein said first specimen comprises a human specimen.
 12. The method as recited in claim 11 wherein said human specimen is collected without identifying information.
 13. The method as recited in claim 11, further comprising collecting information attesting to compliance with investigative protocol.
 14. The method as recited in claim 13 wherein said identifying information is collected at a later time than said collection of said first specimen.
 15. The method as recited in claim 1, further comprising collecting information selecting a number of nucleic acids in said first specimen to be assessed.
 16. The method as recited in claim 15 wherein said information is collected via a website.
 17. The method as recited in claim 16, further comprising identifying which of said selected nucleic acids electrophoresis together.
 18. The method as recited in claim 17 wherein amounts of said identified nucleic acids are electrophoresed simultaneously.
 19. The method as recited in claim 1 wherein said numerical value is provided via e-mail.
 20. The method as recited in claim 1 wherein said assessing comprises: providing a standardized mixture comprising a competitive template for said first nucleic acid and a competitive template for a second nucleic acid in said first specimen wherein said competitive templates are at known concentrations relative to each other; combining said standardized mixture with a first sample of said specimen, co-amplifying said first nucleic acid and said competitive template for said first nucleic acid to produce first amplified product thereof; diluting said first amplified product; further co-amplifying said diluted first amplified product of said first nucleic acid and of said competitive template for said first nucleic acid, to produce second amplified product thereof; and co-amplifying said second nucleic acid and said competitive template for said second nucleic acid to produce first amplified product thereof.
 21. The method as recited in claim 20, further comprising: obtaining a first relationship, said first relationship comparing said second amplified product of said first nucleic acid and said second amplified product of said competitive template for said first nucleic acid; obtaining a second relationship, said second relationship comparing said first amplified product of said second nucleic acid and said first amplified product of said competitive template for said second nucleic acid; and comparing said first and said second relationships.
 22. The method as recited in claim 21 wherein said second nucleic acid serves as a reference nucleic acid.
 23. The method as recited in claim 22 wherein said standardized mixture further comprises sufficient amounts of said competitive templates for assessing said first nucleic acid in more than about 106 samples.
 24. A business method of improving drug development, comprising: collecting a first specimen comprising a nucleic acid from a first biological entity administered a candidate drug at first stage of drug development; collecting a second specimen comprising said nucleic acid from a second biological entity at a second stage of drug development; assessing an amount of said nucleic acid in each of said first and said second specimen; directly comparing said amounts; and altering a step of said drug development based on said comparison.
 25. The method as recited in claim 24 wherein said first or said second biological entity is at least one entity selected from a virus, a cell, a tissue, an in vitro culture, a plant, an animal, and a subject participating in a clinical trial.
 26. The method as recited in claim 24 wherein said first or said second stage of drug development comprises at least 2 stages selected from drug target screening, lead identification, pre-clinical validation, clinical trial and patient treatment.
 27. The method as recited in claim 26 wherein said pre-clinical validation comprises a bioassay and/or an animal study.
 28. The method as recited in claim 1 wherein said altering comprises a stratification of a clinical trial.
 29. The method as recited in claim 28 wherein said stratification involves identifying subjects to have a reduced side effect.
 30. The method as recited in claim 1 wherein said altering reduces the time for said drug development.
 31. A business method of improving drug development, comprising: providing a database comprising numerical values corresponding to amounts of a first nucleic acid in a number of samples wherein said numerical values are directly comparable between 5 of said samples; collecting a first specimen comprising said first nucleic acid from a biological entity administered a candidate drug at a stage of drug development; assessing an amount of said first nucleic acid in a first sample of said first specimen; directly comparing said amount to at least one of said numerical values in said database; and altering a step of said drug development based on said comparison.
 32. The method as recited in claim 31 wherein said biological entity is at least one entity selected from a virus, a cell, a tissue, an in vitro culture, a plant, an animal, and a subject participating in a clinical trial.
 33. The method as recited in claim 31 wherein said stage of drug development comprises at least one stage selected from drug target screening, lead identification, pre-clinical validation, clinical trial and patient treatment.
 34. The method as recited in claim 33 wherein said pre-clinical validation comprises a bioassay and/or an animal study.
 35. The method as recited in claim 31 wherein said altering comprises a stratification of a clinical trial.
 36. The method as recited in claim 35 wherein said stratification involves identifying subjects to have a reduced side effect.
 37. The method as recited in claim 31 wherein said altering reduces the time for said drug development.
 38. A business method of improving drug development, comprising: providing a database comprising numerical indices, said numerical indices obtained by mathematical computation of 2 numerical values corresponding to amounts of 2 nucleic acids in a number of samples wherein said numerical indices are directly comparable between 5 of said samples; collecting a first specimen comprising said 2 nucleic acids from a biological entity administered a candidate drug at a stage of drug development; assessing an amount of each of said 2 nucleic acids in a first sample of said first specimen; using said 2 amounts to mathematically compute a first numerical index; directly comparing said first numerical index to at least one of said numerical indices in said database; and altering a step of said drug development based on said comparison.
 39. The method as recited in claim 38 wherein said biological entity is at least one entity selected from a virus, a cell, a tissue, an in vitro culture, a plant, an animal, and a subject participating in a clinical trial.
 40. The method as recited in claim 38 wherein said stage of drug development comprises at least one stage selected from drug target screening, lead identification, pre-clinical validation, clinical trial and patient treatment.
 41. The method as recited in claim 40 wherein said pre-clinical validation comprises a bioassay and/or an animal study.
 42. The method as recited in claim 38 wherein said altering comprises a stratification of a clinical trial.
 43. The method as recited in claim 42 wherein said stratification involves identifying subjects to have a reduced side effect.
 44. The method as recited in claim 38 wherein said altering reduces the time for said drug development.